Abstract

We consider the problem of exact support recovery of sparse signals via noisy measurements. The
main focus is the sufficient and necessary conditions on the number of measurements for support recovery
to be reliable. By drawing an analogy between the problem of support recovery and the problem of channel
coding over the Gaussian multiple access channel, and exploiting mathematical tools developed for the
latter problem, we obtain an information theoretic framework for analyzing the performance limits of
support recovery. Sharp sufficient and necessary conditions on the number of measurements in terms of
the signal sparsity level and the measurement noise level are derived. Specifically, when the number of
nonzero entries is held fixed, the exact asymptotics on the number of measurements for support recovery
are developed. When the number of nonzero entries increases in certain manners, we obtain sufficient
conditions tighter than existing results. In addition, we show that the proposed methodology can deal
with a variety of models of sparse signal recovery, hence demonstrating its potential as an effective
analytical tool.



I. Introduction 

Consider the estimation of a sparse signal $X \in \mathbb{R}^m$ in high dimension via linear measurements
$Y = AX + Z$, where $A \in \mathbb{R}^{n \times m}$ is referred to as the measurement matrix and $Z$ is the
measurement noise. A sparse signal is informally described as a signal whose representation in certain
coordinates contains a large proportion of zero coefficients. In this paper, we mainly consider signals that
are sparse with respect to the canonical basis of the Euclidean space. The goal is to estimate the sparse
signal $X$ using as few measurements as possible. This problem has received much attention from many
research disciplines, motivated by a wide spectrum of applications such as compressed sensing [1], [2],
biomagnetic inverse problems [3], [4], image processing [5], [6], bandlimited extrapolation and spectral
estimation [7], robust regression and outlier detection [8], speech processing [9], channel estimation
[10], [11], echo cancellation [12], [13], and wireless communication [10], [14].

The material in this paper was presented in part at the IEEE International Conference on Acoustics, Speech, and Signal
Processing (ICASSP), Las Vegas, Nevada, USA, March 2008 and the IEEE International Symposium on Information Theory
(ISIT), Toronto, Ontario, Canada, July 2008. A short version of the paper was submitted to the IEEE International Symposium
on Information Theory (ISIT), Austin, Texas, USA, 2010.

Computationally efficient algorithms for sparse signal recovery have been proposed to find or approximate
the sparse solution $X$ in various settings. A partial list includes matching pursuit [15], orthogonal
matching pursuit [16], lasso [17], basis pursuit [18], FOCUSS [3], sparse Bayesian learning [19], finite
rate of innovation [20], CoSaMP [21], and subspace pursuit [22]. At the same time, many exciting
mathematical tools have been developed to analyze the performance of these algorithms. In particular,
Donoho [1], Donoho, Elad, and Temlyakov [23], Candes and Tao [24], and Candes, Romberg, and
Tao [25] presented sufficient conditions for $\ell_1$-norm minimization algorithms, including basis pursuit, to
successfully recover the sparse signals with respect to certain performance metrics. Tropp [26], Tropp
and Gilbert [27], and Donoho, Tsaig, Drori, and Starck [28] studied greedy sequential selection methods
such as matching pursuit and its variants. In these papers, the structural properties of the measurement
matrix $A$, including coherence metrics [15], [23], [26], [29] and spectral properties [1], [24], are used
as the major ingredient of the performance analysis. By using random sensing matrices, these results
translate to relatively simple tradeoffs between the dimension of the signal $X$, the number of nonzero
entries in $X$, and the number of measurements to ensure asymptotically successful reconstruction of the
sparse signal. In the absence of measurement noise, i.e., $Z = 0$, the performance metric employed is the
ability to recover the exact sparse signal [24]. When measurement noise is present, the Euclidean
distance between the recovered signal and the true signal has often been employed as the performance
metric [23], [25].

In many applications, however, finding the exact support of the signal is important even in the noisy
setting. For example, in applications of medical imaging, magnetoencephalography (MEG) and
electroencephalography (EEG) are common approaches for collecting noninvasive measurements of external
electromagnetic signals [30]. A relatively fine spatial resolution is required to localize the neural electrical
activities from a huge number of potential locations [31]. In the domain of cognitive radio, spectrum
sensing plays an important role in identifying available spectrum for communication, where estimating
the number of active subbands and their locations becomes a nontrivial task [32]. In multiple-user
communication systems such as a code-division multiple access (CDMA) system, the problem of neighbor
discovery requires identification of active nodes from all potential nodes in a network based on a linear
superposition of the signature waveforms of the active nodes [14]. In all these problems, finding the
support of the sparse signal is more important than approximating the signal vector in the Euclidean
distance. Hence, it is important to understand performance issues in the exact support recovery of sparse
signals with noisy measurements. Information theoretic tools have proven successful in this direction.
Wainwright [33], [34] considered the problem of exact support recovery using the optimal maximum
likelihood decoder. Necessary and sufficient conditions are established for different scalings between
the sparsity level and signal dimension. Using the same decoder, Fletcher, Rangan, and Goyal [35], [36]
recently improved the necessary condition. Wang, Wainwright, and Ramchandran [37] also presented a set
of necessary conditions for exact support recovery. Akçakaya and Tarokh [38] analyzed the performance
of a joint typicality decoder and applied it to find a set of necessary and sufficient conditions under
different performance metrics including the one for exact support recovery. In addition, a series of papers
have leveraged many information theoretic tools, including rate-distortion theory [39], [40], expander
graphs [41], belief propagation and list decoding [42], and low-density parity-check codes [43], to design
novel algorithms for sparse signal recovery and to analyze their performance.

In this paper, we develop sharper asymptotic tradeoffs between the signal dimension m, the number 
of nonzero entries k, and number of measurements n for reliable support recovery in the noisy setting. 
When k is held fixed, we show that n = (log m)/c(X) is sufficient and necessary. We give a complete 
characterization of c(X) that depends on the values of the nonzero entries of X. When k increases in 
certain manners as specified later, we obtain sufficient and necessary conditions for perfect support recov- 
ery which improve upon existing results. Our main results are inspired by the analogy to communication 
over the additive white Gaussian noise multiple access channel (AWGN-MAC) fl4l . fl31 . According 
to this connection, the columns of the measurement matrix form a common codebook for all senders. 
Codewords from the senders are individually multiplied by unknown channel gains, which correspond 
to nonzero entries of X. Then, the noise corrupted linear combination of these codewords is observed. 
Thus, support recovery can be interpreted as decoding messages from multiple senders. With appropriate 
modifications, the techniques for deriving multiple-user channel capacity can be leveraged to provide 
performance tradeoffs for support recovery. 

The analogy between the problems of sparse signal recovery and channel coding has been observed 
from various perspectives in parallel work 1391, Hi IV-D], 1371 II-A], 131 IH-A], m 11.2]. However, 
our approach is different from the existing literature in several aspects. First, we explicitly connect the 
problem of exact support recovery to that of multiple access communication by interpreting the sparse 
signal measurement model as a multiple access channel model. In spite of their similarity, however, 
there are also important differences between them which make a straightforward translation of known 
results nontrivial. We customize tools from multiple-user information theory (e.g., signal value estimation, 
distance decoding, Fano's inequality) to tackle the support recovery problem. Second, equipped with this 
analytic framework, we can obtain a performance tradeoff sharper than existing results. Moreover, the
analytical framework can be extended to different models of sparse signal recovery, such as non-Gaussian 
measurement noise, sources with random activity levels, and multiple measurement vectors (MMV). 

The rest of the paper is organized as follows. We formally state the support recovery problem in Section
II. To motivate the main results of the paper and their proof techniques, we discuss in Section III the
similarities and differences between the support recovery problem and the multiple access communication
problem. Our main results are presented in Section IV, together with comparisons to existing results in
the literature. The proofs of the main theorems are presented in Appendices A, B, C, and D, respectively.
Section V further extends the results to different signal models and measurement procedures.

Throughout this paper, a set is a collection of unique objects. Let $\mathbb{R}^m$ denote the $m$-dimensional real
Euclidean space. Let $\mathbb{N} = \{1, 2, 3, \ldots\}$ denote the set of natural numbers. Let $[k]$ denote the set
$\{1, 2, \ldots, k\}$. The notation $|S|$ denotes the cardinality of set $S$, $\|x\|$ denotes the $\ell_2$-norm of a
vector $x$, and $\|A\|_F$ denotes the Frobenius norm of a matrix $A$. The expression $f(x) = o(g(x))$ denotes
$\lim_{x \to \infty} f(x)/g(x) = 0$, $f(x) = O(g(x))$ denotes $|f(x)| \le \alpha |g(x)|$ as $x \to \infty$ for some
constant $\alpha > 0$, $f(x) = \Theta(g(x))$ denotes $f(x) = O(g(x))$ and $g(x) = O(f(x))$, $f(x) = \Omega(g(x))$
denotes $g(x) = O(f(x))$, and $f(x) = \omega(g(x))$ denotes $g(x) = o(f(x))$.



II. Problem Formulation

Let $w = [w_1, \ldots, w_k]^T \in \mathbb{R}^k$, where $w_i \neq 0$ for all $i$. Let $S = [S_1, \ldots, S_k]^T \in [m]^k$
be such that $S_1, \ldots, S_k$ are chosen uniformly at random from $[m]$ without replacement. In particular,
$\{S_1, \ldots, S_k\}$ is uniformly distributed over all size-$k$ subsets of $[m]$. Then, the signal of interest
$X = X(w, S)$ is generated as

$$X_s = \begin{cases} w_j & \text{if } s = S_j, \\ 0 & \text{if } s \notin \{S_1, \ldots, S_k\}. \end{cases} \tag{1}$$

Thus, the support of $X$ is $\mathrm{supp}(X) = \{S_1, \ldots, S_k\}$. According to the signal model (1), $|\mathrm{supp}(X)| = k$.
Throughout this paper, we assume $k$ is known. The signal is said to be sparse when $k \ll m$.
We measure $X$ through the linear operation

$$Y = AX + Z \tag{2}$$

where $A \in \mathbb{R}^{n \times m}$ is the measurement matrix, $Z \in \mathbb{R}^n$ is the measurement noise, and
$Y \in \mathbb{R}^n$ is the noisy measurement. We further assume that the noise entries $Z_i$ are independently and
identically distributed (i.i.d.) according to the Gaussian distribution $\mathcal{N}(0, \sigma_z^2)$.
Upon observing the noisy measurement $Y$, the goal is to recover the support of the sparse signal $X$.
A support recovery map is defined as

$$d : \mathbb{R}^n \to 2^{[m]}. \tag{3}$$

Given the signal model (1), the measurement model (2), and the support recovery map (3), we define
the average probability of error by

$$P_e(w, A) \triangleq P\{d(Y) \neq \mathrm{supp}(X(w, S))\} \tag{4}$$

for each (unknown) signal value vector $w \in \mathbb{R}^k$.
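
As an illustration of the formulation above, the following is a minimal sketch of the signal model (1) and the measurement model (2) in plain Python. The function names (`generate_signal`, `measure`, `support_of`), the 0-based indexing, and all numerical values are assumptions made for the example; they are not part of the paper.

```python
import random

def generate_signal(w, m, rng):
    """Signal model (1): place the entries of w at k distinct, uniformly chosen positions."""
    k = len(w)
    support = rng.sample(range(m), k)  # S_1, ..., S_k drawn without replacement (0-based)
    x = [0.0] * m
    for s, w_j in zip(support, w):
        x[s] = w_j
    return x, support

def measure(a, x, sigma_z, rng):
    """Measurement model (2): Y = A X + Z with i.i.d. N(0, sigma_z^2) noise entries."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + rng.gauss(0.0, sigma_z)
            for row in a]

def support_of(x):
    """Return the set of indices of the nonzero entries of x."""
    return {i for i, x_i in enumerate(x) if x_i != 0.0}

# Hypothetical parameters chosen only for illustration.
rng = random.Random(0)
m, n, sigma_a, sigma_z = 20, 8, 1.0, 0.1
w = [1.5, -0.7, 2.0]
a = [[rng.gauss(0.0, sigma_a) for _ in range(m)] for _ in range(n)]
x, s = generate_signal(w, m, rng)
y = measure(a, x, sigma_z, rng)
```

A support recovery map in the sense of (3) is then any function taking `y` to a subset of `range(m)`, and an error in the sense of (4) occurs whenever that subset differs from `support_of(x)`.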

III. An Information Theoretic Perspective on Sparse Signal Recovery 

In this section, we will introduce an important interpretation of the problem of sparse signal recovery 
via a communication problem over the Gaussian multiple access channel. The similarities and differences 
between the two problems will be elucidated, hence progressively unraveling the intuition and facilitating 
technical preparation for the main results and their proof techniques. 

A. Brief Review on the AWGN-MAC 

We start by reviewing the background on the $k$-sender multiple access channel (MAC). Suppose the
senders wish to transmit information to a common receiver. Each sender $i$ has access to a codebook
$\mathcal{C}^{(i)} = \{c_1^{(i)}, c_2^{(i)}, \ldots, c_{m^{(i)}}^{(i)}\}$, where $c_j^{(i)} \in \mathbb{R}^n$ is a codeword and
$m^{(i)}$ is the number of codewords in $\mathcal{C}^{(i)}$. The rate for the $i$th sender is $R^{(i)} = (\log m^{(i)})/n$.
To transmit information, each sender chooses a codeword from its codebook, and all senders transmit their
codewords simultaneously over an AWGN-MAC [47]:

$$Y_l = h_1 X_{1,l} + h_2 X_{2,l} + \cdots + h_k X_{k,l} + Z_l, \quad l = 1, 2, \ldots, n \tag{5}$$

where $X_{i,l}$ denotes the input symbol from the $i$th sender to the channel at the $l$th use of the channel, $h_i$
denotes the channel gain associated with the $i$th sender, $Z_l$ is the additive noise, i.i.d. $\mathcal{N}(0, \sigma_z^2)$,
and $Y_l$ is the channel output.

Upon receiving $Y_1, \ldots, Y_n$, the receiver needs to determine the codewords transmitted by each sender.
Since the senders interfere with each other, there is an inherent tradeoff among their operating rates. The
notion of capacity region is introduced to capture this tradeoff by characterizing all possible rate tuples
$(R^{(1)}, R^{(2)}, \ldots, R^{(k)})$ at which reliable communication can be achieved with diminishing error probability
of decoding. By assuming each sender obeys the power constraint $\|c_j^{(i)}\|^2/n \le \sigma_c^2$ for all
$j \in [m^{(i)}]$ and all $i \in [k]$, the capacity region of an AWGN-MAC with known channel gains [47] is

$$\left\{ (R^{(1)}, \ldots, R^{(k)}) : \sum_{i \in T} R^{(i)} \le \frac{1}{2} \log\left(1 + \frac{\sigma_c^2}{\sigma_z^2} \sum_{i \in T} h_i^2\right), \ \forall\, T \subseteq [k] \right\}. \tag{6}$$
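
To make the constraints in (6) concrete, the sketch below checks whether a given rate tuple lies in the capacity region by enumerating every nonempty subset $T \subseteq [k]$. The function name `in_capacity_region` is hypothetical, and natural logarithms (rates in nats) are assumed because the text does not fix the base of the logarithm.

```python
import math
from itertools import combinations

def in_capacity_region(rates, gains, sigma_c2, sigma_z2):
    """Check every subset constraint of (6) for the given rate tuple (rates in nats)."""
    k = len(rates)
    for size in range(1, k + 1):
        for subset in combinations(range(k), size):
            sum_rate = sum(rates[i] for i in subset)
            received_power = sigma_c2 * sum(gains[i] ** 2 for i in subset)
            bound = 0.5 * math.log(1.0 + received_power / sigma_z2)
            if sum_rate > bound:
                return False
    return True

# Two senders with unit gains and unit signal-to-noise ratio (illustrative values).
inside = in_capacity_region([0.3, 0.2], [1.0, 1.0], 1.0, 1.0)
sum_violated = in_capacity_region([0.3, 0.3], [1.0, 1.0], 1.0, 1.0)
single_violated = in_capacity_region([0.4, 0.1], [1.0, 1.0], 1.0, 1.0)
```

The enumeration visits $2^k - 1$ subsets, which is adequate for the small $k$ of an illustration but not for large $k$.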

B. Connecting Sparse Signal Recovery to the AWGN-MAC 

In the measurement model (2), one can remove the columns in $A$ which are nulled out by zero entries
in $X$ and obtain the following effective form of the measurement procedure

$$Y = X_{S_1} a_{S_1} + \cdots + X_{S_k} a_{S_k} + Z. \tag{7}$$

By contrasting (7) to the AWGN-MAC (5), we can draw the following key connections that relate the two
problems [44].

1) A nonzero entry as a sender: We can view the existence of a nonzero entry position $S_j$ as sender
$j$ that accesses the MAC.

2) $a_{S_j}$ as a codeword: We treat the measurement matrix $A$ as a codebook with each column $a_j$,
$j \in [m]$, as a codeword. Each element of $a_{S_j}$ is fed one by one to the channel (5) as the input
symbol $X_j$, resulting in $n$ uses of the channel. The noise $Z$ and measurement $Y$ can be related to
the channel noise $Z$ and channel output $Y$ in the same fashion.

3) $X_{S_i}$ as a channel gain: The nonzero entry $X_{S_i}$ in (7) plays the role of the channel gain $h_i$ in
(5). Essentially, we can interpret the vector representation (7) as $n$ consecutive uses of the $k$-sender
AWGN-MAC (5) with appropriate stacking of the inputs/outputs into vectors.

4) Similarity between objectives: In the problem of sparse signal recovery, the goal is to find the
support $\{S_1, \ldots, S_k\}$ of the signal. In the problem of MAC communication, the receiver's goal is
to determine the indices of codewords, i.e., $S_1, \ldots, S_k$, that are transmitted by the senders.

Based on the abovementioned aspects, the two problems share significant similarities which enable 
leveraging the information theoretic methods for performance analysis of support recovery of sparse 
signals. However, as we shall see next, there are domain specific differences between the support recovery 
problem and the channel coding problem that should be addressed accordingly to rigorously apply the 
information theoretic approaches. 

C. Key Differences 

1) Common codebook: In MAC communication, each sender uses its own codebook. However, in
sparse signal recovery, the "codebook" $A$ is shared by all "senders". All senders choose their
codewords from the same codebook and hence operate at the same rate. Different senders will not
choose the same codeword, or they will collapse into one sender.
2) Unknown channel gains: In MAC communication, the capacity region (6) is valid assuming that
the receiver knows the channel gain $h_i$ [48]. In contrast, for the sparse signal recovery problem, $X_{S_i}$
is actually unknown and needs to be estimated. Although coding techniques and capacity results
are available for communication with channel uncertainty, a closer examination indicates that those
results are not directly applicable to our problem. For instance, channel training with pilot symbols
is a common practice to combat channel uncertainty [49]. However, it is not obvious how to
incorporate the training procedure into the measurement model (2), and hence the related results
are not directly applicable.
Once these differences are properly accounted for, the connection between the problems of sparse 
signal recovery and channel coding makes available a variety of information theoretic tools for handling 
performance issues pertaining to the support recovery problem. Based on techniques that are rooted in 
channel capacity results, but suitably modified to deal with the differences, we will present the main 
results of this paper in the next section. 

IV. Main Results and Their Implications 

A. Fixed Number of Nonzero Entries 

To discover the precise impact of the values of the nonzero entries on support recovery, we consider 
the support recovery of a sequence of sparse signals generated with the same signal value vector w. In 
particular, we assume that k is fixed. Define the auxiliary quantity 

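$$c(w) \triangleq \min_{T \subseteq [k]} \left[ \frac{1}{2|T|} \log\left(1 + \frac{\sigma_a^2}{\sigma_z^2} \sum_{j \in T} w_j^2\right) \right]. \tag{8}$$

For example, when $k = 2$,

$$c(w_1, w_2) = \min\left[ \frac{1}{2} \log\left(1 + \frac{\sigma_a^2 w_1^2}{\sigma_z^2}\right),\ \frac{1}{2} \log\left(1 + \frac{\sigma_a^2 w_2^2}{\sigma_z^2}\right),\ \frac{1}{4} \log\left(1 + \frac{\sigma_a^2 (w_1^2 + w_2^2)}{\sigma_z^2}\right) \right].$$

We can see from Section III that this quantity is closely related to the 2-sender multiple access channel
capacity with equal-rate constraint.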
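
The quantity in (8) can be evaluated directly for small $k$ by enumerating the nonempty subsets $T$. The following sketch assumes natural logarithms and uses the hypothetical function name `c_of_w`; it is meant only to illustrate how the weakest subset of entries determines $c(w)$.

```python
import math
from itertools import combinations

def c_of_w(w, sigma_a2, sigma_z2):
    """Evaluate (8): minimum over nonempty T of log(1 + snr * sum_{j in T} w_j^2) / (2|T|)."""
    snr = sigma_a2 / sigma_z2
    best = math.inf
    for size in range(1, len(w) + 1):
        for subset in combinations(w, size):
            energy = sum(v * v for v in subset)
            best = min(best, math.log(1.0 + snr * energy) / (2 * size))
    return best

# Equal entries: the joint (sum-rate) term with |T| = 2 is the minimum.
c_equal = c_of_w([1.0, 1.0], 1.0, 1.0)
# One weak entry: the single-entry term of the weak entry is the minimum.
c_weak = c_of_w([0.1, 3.0], 1.0, 1.0)
```

In the second case the small entry $w_1 = 0.1$ alone fixes $c(w)$, which matches the reading of (8) as an equal-rate constraint in which the hardest subset of senders to decode limits the common rate.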

In the following two theorems, we summarize our main results under the assumption that $k$ is fixed.
The subscript in $n_m$ denotes possible dependence between $n$ and $m$. The proofs of the theorems are
presented in Appendices A and B, respectively.
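Theorem 1: If

$$\limsup_{m \to \infty} \frac{\log m}{n_m} < c(w) \tag{9}$$

then there exist a sequence of matrices $\{A^{(m)}\}_{m=k}^{\infty}$, $A^{(m)} \in \mathbb{R}^{n_m \times m}$, and a sequence of
support recovery maps $\{d^{(m)}\}_{m=k}^{\infty}$, $d^{(m)} : \mathbb{R}^{n_m} \to 2^{[m]}$, such that

$$\frac{1}{n_m m} \|A^{(m)}\|_F^2 \le \sigma_a^2 \tag{10}$$

and

$$\lim_{m \to \infty} P_e(w, A^{(m)}) = 0. \tag{11}$$

Theorem 2: If there exist a sequence of matrices $\{A^{(m)}\}_{m=k}^{\infty}$, $A^{(m)} \in \mathbb{R}^{n_m \times m}$, and a sequence of
support recovery maps $\{d^{(m)}\}_{m=k}^{\infty}$, $d^{(m)} : \mathbb{R}^{n_m} \to 2^{[m]}$, such that

$$\frac{1}{n_m m} \|A^{(m)}\|_F^2 \le \sigma_a^2 \tag{12}$$

and

$$\lim_{m \to \infty} P_e(w, A^{(m)}) = 0 \tag{13}$$

then

$$\limsup_{m \to \infty} \frac{\log m}{n_m} \le c(w). \tag{14}$$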

Theorems 1 and 2 together indicate that $n = (\log m)/(c(w) \pm \epsilon)$ is sufficient and necessary for exact
support recovery. The constant $c(w)$ is explicitly characterized, capturing the role of signal strength in
support recovery.
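
As a rough numerical reading of this statement, the sketch below computes the two measurement counts $(\log m)/(c(w) + \epsilon)$ and $(\log m)/(c(w) - \epsilon)$ for a given value of $c(w)$. The function name `measurement_range` and the numbers are assumptions for illustration; since Theorems 1 and 2 are asymptotic in $m$, the outputs indicate scaling rather than guarantees at finite $m$.

```python
import math

def measurement_range(m, c, eps):
    """Return (log m / (c + eps), log m / (c - eps)) for 0 < eps < c, using natural logs.

    By Theorem 2, growth of n below the first value violates (14) asymptotically;
    by Theorem 1, growth of n at the second value satisfies (9).
    """
    if not 0.0 < eps < c:
        raise ValueError("eps must satisfy 0 < eps < c")
    log_m = math.log(m)
    return log_m / (c + eps), log_m / (c - eps)

# Illustrative values only: m = 1000, c(w) = 0.5, eps = 0.1.
too_few, enough = measurement_range(1000, 0.5, 0.1)
threshold = math.log(1000) / 0.5
```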

B. Growing Number of Nonzero Entries 

Next, we consider the support recovery for the case where k, the number of nonzero entries, grows with 
m, the dimension of the signal. We assume that the magnitude of a nonzero entry is bounded from both 
below and above. Meanwhile, we consider using random measurement matrices drawn from the Gaussian 
distribution, which makes it more convenient to compare with existing results in the literature. Note that 
we can easily establish corresponding results on the existence of arbitrary measurement matrices as in 
Theorems 1 and 2. 

First, we present a sufficient condition for exact support recovery. The proof can be found in Appendix
C.

Theorem 3: Let $\{w^{(m)}\}_{m=1}^{\infty}$ be a sequence of vectors satisfying $w^{(m)} \in \mathbb{R}^{k_m}$ and
$0 < w_{\min} \le |w_j^{(m)}| \le w_{\max} < \infty$ for all $j \in [k_m]$, $m \ge 1$. Let $A^{(m)} \in \mathbb{R}^{n_m \times m}$ be
generated as $A_{ij}^{(m)} \sim \mathcal{N}(0, \sigma_a^2)$. If

< 1 (15) 



lim sup max 

m-s>oo rim je[fen.] 



log '-^^ + 1 



then there exists a sequence of support recovery maps $\{d^{(m)}\}_{m=1}^{\infty}$, $d^{(m)} : \mathbb{R}^{n_m} \to 2^{[m]}$, such that

$$\lim_{m \to \infty} P\{d^{(m)}(A^{(m)} X(w^{(m)}, S) + Z) \neq \mathrm{supp}(X(w^{(m)}, S))\} = 0.$$

To better understand Theorem 3, we present the following implication of (15) that shows the tradeoffs
between the order of $n$ versus $m$ and $k$.

Corollary 1: Under the assumption of Theorem 3,

$$\lim_{m \to \infty} P\{d^{(m)}(A^{(m)} X(w^{(m)}, S) + Z) \neq \mathrm{supp}(X(w^{(m)}, S))\} = 0$$

provided that

$$n = \max\left\{ \Omega(k \log k),\ \Omega\left(\frac{k}{\log k} \log m\right) \right\}.$$

In particular, we have the following:

1) When $m = k^{\Omega(\log k)}$, for example $m = k^{\log k}$, the sufficient number of measurements is
$n = \Omega\left(\frac{k}{\log k} \log m\right)$.

2) When $e^{\omega(\log k)} \le m \le k^{o(\log k)}$, for example $m = k^{\log \log k}$, the sufficient number of
measurements is $n = \Omega(k \log k)$. In this case, $\log m = \omega(\log k)$, and hence $\Omega(k \log k) < \Omega(k \log m)$.
Thus, $n = \Omega(k \log k)$ is a better sufficient condition than $n = \Omega(k \log m)$.

3) When $\omega(k) \le m \le e^{\Theta(\log k)}$, for example $m = k^2$, the sufficient number of measurements is
$n = \Omega(k \log m)$.

4) When $m = \Theta(k)$, the sufficient number of measurements is $n = \Omega(k \log m)$.

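The following table summarizes the sufficient orders of $n$ paired with different
relations between $m$ and $k$ in Corollary 1.

| Relation between $m$ and $k$ | Sufficient $n$ |
| $m = \omega(k)$: $m = k^{\Omega(\log k)}$ | $n = \Omega\left(\frac{k}{\log k} \log m\right)$ |
| $m = \omega(k)$: $e^{\omega(\log k)} \le m \le k^{o(\log k)}$ | $n = \Omega(k \log k)$ |
| $m = \omega(k)$: $\omega(k) \le m \le e^{\Theta(\log k)}$ | $n = \Omega(k \log m)$ |
| $m = \Theta(k)$ | $n = \Omega(k \log m)$ |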
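
The regimes in Corollary 1 differ only in which of the two terms, $k \log k$ or $(k/\log k) \log m$, dominates. The sketch below evaluates both terms with all hidden constants set to one, which is an assumption made purely for illustration; the function name `dominant_sufficient_term` is likewise hypothetical.

```python
import math

def dominant_sufficient_term(m, k):
    """Compare k log k with (k / log k) log m, ignoring the constants hidden in Omega(.).

    Requires k >= 2 so that log k > 0. Returns the larger value and a label for it.
    """
    term_klogk = k * math.log(k)
    term_ratio = k * math.log(m) / math.log(k)
    if term_klogk >= term_ratio:
        return term_klogk, "k log k"
    return term_ratio, "k log m / log k"

# m = k^2 (polynomial in k): the k log k term dominates.
value_poly, label_poly = dominant_sufficient_term(100 ** 2, 100)
# m = k^30 with k = 10, so log m > (log k)^2: the second term dominates.
value_large, label_large = dominant_sufficient_term(10 ** 30, 10)
```

The crossover occurs where $\log m = (\log k)^2$, consistent with the boundary $m = k^{\log k}$ between the first two rows of the table.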



In the existing literature, Wainwright [34] and Akçakaya and Tarokh [38] both derived sufficient
conditions for exact support recovery. Under the same assumption of Theorem 3, the sufficient conditions
presented in these papers, respectively, are summarized in the following table:

| Relation between $m$ and $k$ | Wainwright [34] | Akçakaya et al. [38] |
| $m = \omega(k)$ | $n = \Omega(k \log(m/k))$ | $n = \Omega(k \log(m - k))$ |
| $m = \Theta(k)$ | $n = \Omega(m)$ | $n = \Omega(m)$ |

To compare the results, we first examine the case of $m = \omega(k)$ (i.e., sublinear sparsity). Note that in
the regime where $m = e^{\omega(\log k)}$, our sufficient condition on $n$ has a lower order of growth, and hence is
better, than existing results. In the regime where $\omega(k) \le m \le e^{\Theta(\log k)}$, there exists a certain scenario,
e.g., k = , ™ , in which our sufficient condition is of the same order as in [38] but higher than in [34].

In the case of $m = \Theta(k)$ (i.e., linear sparsity), we see that our sufficient condition is stricter, implying
its inferiority to existing results in this regime.

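Next, we present a necessary condition, the proof of which can be found in Appendix D.

Theorem 4: Let $\{w^{(m)}\}_{m=1}^{\infty}$ be a sequence of vectors satisfying $w^{(m)} \in \mathbb{R}^{k_m}$ and
$0 < |w_j^{(m)}| \le w_{\max} < \infty$ for all $j \in [k_m]$, $m \ge 1$. Let $A^{(m)} \in \mathbb{R}^{n_m \times m}$ be
generated as $A_{ij}^{(m)} \sim \mathcal{N}(0, \sigma_a^2)$. If

$$\limsup_{m \to \infty} \frac{1}{n_m} \left[ \frac{2 k_m \log(m/k_m)}{\log\left(\frac{k_m w_{\max}^2 \sigma_a^2}{\sigma_z^2} + 1\right)} \right] > 1 \tag{16}$$

then for any sequence of support recovery maps $\{d^{(m)}\}_{m=1}^{\infty}$, $d^{(m)} : \mathbb{R}^{n_m} \to 2^{[m]}$, we have

$$\liminf_{m \to \infty} P\{d^{(m)}(A^{(m)} X(w^{(m)}, S) + Z) \neq \mathrm{supp}(X(w^{(m)}, S))\} > 0.$$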
To compare with existing results under the same assumption¹ of Theorem 4, we first note that when
$m = \Theta(k)$, the necessary condition is $n = \Omega(m)$, which follows simply from the elementary constraint
$n \ge k$ that the number of measurements has to be no smaller than the number of nonzero entries
for support recovery to be possible. Contrasted with the sufficient conditions derived in [34] and [38],
$n = \Omega(m)$ is the necessary and sufficient condition for linear sparsity. When $m = \omega(k)$, we summarize
the necessary conditions developed in previous papers in the following table:

| Relation between $m$ and $k$ | $m = \omega(k)$ |
| Wainwright [34] | $n = \Omega(\log m)$ |
| Wang et al. [37] | |
| Akçakaya et al. [38]² | $n = \Omega\left(\frac{k \log(m/k)}{\log k}\right)$ |
| Theorem 4 | $n = \Omega\left(\frac{k \log(m/k)}{\log k}\right)$ |

In this case, $n = \Omega\left(\frac{k \log(m/k)}{\log k}\right)$ is the best known necessary condition.
C. Further Observations 

Note that for the sublinear sparsity with $m = k^{\Omega(\log k)}$, both $\log(m/k)$ and $\log m$ are of the same order
and hence our sufficient and necessary conditions both indicate $n = \Theta\left(\frac{k}{\log k} \log m\right)$. This provides a sharp
performance tradeoff for support recovery in this specific regime, which to our knowledge has not been
observed in previous work (see, for example, the remarks in [34, III-A], [36, III-Remark 2)]). For the
regime where $\omega(k) \le m \le k^{o(\log k)}$, the orders of $n$ in any pair of sufficient and necessary conditions
have a nontrivial difference, leaving an open question on further narrowing the gap in this remaining
regime of sublinear sparsity.

In addition, it is worthwhile to note that our analytical framework could also be adapted to the case
where $w_{\min} = O(1/\sqrt{k})$. This is a scenario extensively discussed in [34], [36], [37]. We will not pursue
this direction in detail.

¹The necessary conditions derived in [34], [37], and [38] were originally derived under slightly different assumptions. Here
we adapted them to compare the asymptotic orders of $n$.

²This result is implied in [38], by identifying C4 in Thm. 1.6 therein, and clarifying the order of $n$. The proof of Thm. 1.6
states that (below its (25)) asymptotically reliable support recovery is not possible if
$n < [\log(1 + \|w^{(m)}\|^2/\sigma_z^2)]^{-1} m H(k/m) - \log(m + 1)$. Note that $m H(k/m) = \Theta(k \log(m/k))$. Hence, we consider
$n = \Omega\left(\frac{k \log(m/k)}{\log k}\right)$ an appropriate necessary condition resulting from the proof in [38].
V. Extensions

The connection between the problems of support recovery and channel coding can be further explored 
to provide the performance tradeoff for different models of sparse signal recovery. Next, we discuss its 
potential to address several important variants. 

A. Non-Gaussian Noise 

Note that the rules for support recovery, mainly reflected in (22) and (28) in the proof of Theorem 1
in Appendix A, are similar to the method of nearest neighbor decoding in information theory. Following
the argument in [50], one can show that by replacing the assumption in (2) on measurement noise
$Z_i \sim \mathcal{N}(0, \sigma_z^2)$ with $\mathrm{Var}(Z_i) = \sigma_z^2$, the results in the previous theorems continue to hold.

B. Random Signal Activities 

In Theorem 1, $w$ is assumed to be a fixed vector of nonzero entries. We now relax this condition
to allow random $W$, which leads to sparse signals whose nonzero entries are randomly generated and
located. For simplicity of exposition, assume that $k$ is fixed. Interestingly, the model (2) with this new
assumption can now be contrasted to a MAC with random channel gains

$$Y_l = H_1 X_{1,l} + H_2 X_{2,l} + \cdots + H_k X_{k,l} + Z_l, \quad l = 1, 2, \ldots, n. \tag{17}$$

The difference between (17) and (5) is that the channel gains $H_i$ are random variables in this case.
Specifically, to parallel the problem of support recovery of sparse signals, $H_i$ should be
considered as being realized once and then kept fixed during the entire channel use [44]. This channel
model is usually termed a slow fading channel [48].

The following theorem states the performance of support recovery of sparse signals under random
signal activities.

Theorem 5: Suppose $W$ has bounded support, and $\limsup_{m \to \infty} \frac{\log m}{n_m} = r$. Let $A^{(m)} \in \mathbb{R}^{n_m \times m}$ be
generated as $A_{ij}^{(m)} \sim \mathcal{N}(0, \sigma_a^2)$. Then, there exists a sequence of support recovery maps
$\{d^{(m)}\}_{m=k}^{\infty}$, $d^{(m)} : \mathbb{R}^{n_m} \to 2^{[m]}$, such that

$$\limsup_{m \to \infty} P\{d^{(m)}(A^{(m)} X(W, S) + Z) \neq \mathrm{supp}(X)\} \le P\{c(W) \le r\}$$

where $c(W)$ is defined in (8).
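Proof: Note that

$$\begin{aligned}
&\limsup_{m \to \infty} P\{d^{(m)}(A^{(m)} X(W, S) + Z) \neq \mathrm{supp}(X)\} \\
&\quad = \limsup_{m \to \infty} \int P\{d^{(m)}(A^{(m)} X(w, S) + Z) \neq \mathrm{supp}(X)\}\, dF(w) \\
&\quad \le \limsup_{m \to \infty} \int_{w : c(w) > r} P\{d^{(m)}(A^{(m)} X(w, S) + Z) \neq \mathrm{supp}(X)\}\, dF(w) \\
&\qquad + \limsup_{m \to \infty} \int_{w : c(w) \le r} P\{d^{(m)}(A^{(m)} X(w, S) + Z) \neq \mathrm{supp}(X)\}\, dF(w) \\
&\quad \le \int_{w : c(w) > r} \limsup_{m \to \infty} P\{d^{(m)}(A^{(m)} X(w, S) + Z) \neq \mathrm{supp}(X)\}\, dF(w) + \int_{w : c(w) \le r} dF(w) \qquad (18) \\
&\quad \le P\{c(W) \le r\} \qquad (19)
\end{aligned}$$

where (18) follows from Fatou's lemma [51] and (19) follows by applying the proof of Theorem 1 to
the integrand. ■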

Theorem 5 implies that, in general, rather than having a diminishing error probability, we have to tolerate
a certain error probability, upper-bounded by $P\{c(W) \le r\}$, when the nonzero values are randomly
generated. Conversely, in order to design a system with probability of success at least $1 - p$, one can
find $r$ that satisfies $P\{c(W) \le r\} \le p$.
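
The design rule above can be explored numerically by estimating $P\{c(W) \le r\}$ with Monte Carlo sampling. The sketch below is only an illustration under stated assumptions: the entries of $W$ are drawn i.i.d. from a uniform distribution on $[0.5, 2]$ (a hypothetical choice with bounded support), natural logarithms are used, and the names `c_of_w` and `outage_estimate` are not from the paper.

```python
import math
import random
from itertools import combinations

def c_of_w(w, sigma_a2, sigma_z2):
    """Evaluate (8) by enumerating all nonempty subsets T of the entries of w."""
    snr = sigma_a2 / sigma_z2
    return min(
        math.log(1.0 + snr * sum(v * v for v in subset)) / (2 * size)
        for size in range(1, len(w) + 1)
        for subset in combinations(w, size)
    )

def outage_estimate(r, k, sample_entry, sigma_a2, sigma_z2, trials, rng):
    """Monte Carlo estimate of P{c(W) <= r} for W with k i.i.d. entries."""
    hits = 0
    for _ in range(trials):
        w = [sample_entry(rng) for _ in range(k)]
        if c_of_w(w, sigma_a2, sigma_z2) <= r:
            hits += 1
    return hits / trials

def sample_entry(rng):
    """Hypothetical bounded, nonzero distribution for each entry of W."""
    return rng.uniform(0.5, 2.0)

rng = random.Random(1)
p_small_r = outage_estimate(0.0, 3, sample_entry, 1.0, 1.0, 200, rng)
p_mid_r = outage_estimate(0.3, 3, sample_entry, 1.0, 1.0, 200, rng)
p_large_r = outage_estimate(10.0, 3, sample_entry, 1.0, 1.0, 200, rng)
```

Scanning $r$ and keeping the largest value whose estimate stays below a target $p$ gives a rough, simulation-based counterpart of the rule $P\{c(W) \le r\} \le p$.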



C. Multiple Measurement Vectors 

Recently, increasing research effort has been focused on sparse signal recovery with multiple
measurement vectors (MMV) [52]-[56]. In this problem, we wish to measure multiple sparse signals $X_1(w_1, S)$,
$X_2(w_2, S)$, ..., and $X_t(w_t, S)$ that possess a common sparsity profile, that is, the locations of nonzero
entries are the same in each $X_j$. We use the same measurement matrix $A \in \mathbb{R}^{n \times m}$ to perform

$$Y = AX + Z \tag{20}$$

where $X = [X_1, X_2, \ldots, X_t] \in \mathbb{R}^{m \times t}$, $Z = [Z_1, Z_2, \ldots, Z_t] \in \mathbb{R}^{n \times t}$ is the measurement noise, and
$Y = [Y_1, Y_2, \ldots, Y_t] \in \mathbb{R}^{n \times t}$ is the noisy measurement.

Note that the model (2) can be viewed as a special case of the MMV model (20) with $t = 1$. The
methodology that has been developed in this paper has the potential to be extended to deal with the
performance issues with the MMV model by noting the following connections to channel coding [44].
First, the same set of columns in $A$ are scaled by entries in different $X_j$, forming outputs as elements
in different $Y_j$. The nonzero entries of $X$ can then be viewed as the coefficients that connect different
pairs of inputs and outputs of a channel. Second, each measurement vector $Y_j$ can be viewed as the
received symbols at the $j$th receiver, and hence the MMV model indeed corresponds to a multiple-input
multiple-output (MIMO) channel model. Third, the aim is to recover the locations of nonzero rows of
$X$ upon receiving $Y$. This implies that, in the language of MIMO channel communication, the receivers
will fully collaborate to decode the information sent by all senders. Via proper accommodation of the
method developed in this paper, the capacity results for MIMO channels can be leveraged to shed light
on the performance tradeoff of sparse signal recovery with MMV.

VI. Concluding Remarks 

In this paper, we developed techniques rooted in multiple-user information theory to address the 
performance issues in the exact support recovery of sparse signals, and discovered necessary and sufficient 
conditions on the number of measurements. It is worthwhile to note that the interpretation of sparse signal 
recovery as MAC communication opens new avenues to different theoretic and algorithmic problems in 
sparse signal recovery. We conclude this paper by briefly discussing several interesting potential directions 
made possible by this interpretation: 

1) Among the large collection of algorithms for sparse signal recovery, the sequential selection
methods, including matching pursuit [15] and orthogonal matching pursuit (OMP) [16], determine one
nonzero entry at a time, remove its contribution in the residual signal, and repeat this procedure
until a certain stopping criterion is satisfied. In contrast, the class of convex relaxation methods,
including basis pursuit [18] and lasso [17], jointly estimate the nonzero entries.

Potentially, the sequential selection methods can be viewed as successive interference cancellation 
(SIC) decoding [48] for multiple access channels, whereas the convex relaxation methods can be 
viewed as joint decoding. It would be interesting to ask whether one can make these analogies more 
precise and use them to address performance issues. Similarities at an intuitive level between OMP 
and SIC have been discussed in [45] with performance results supported by empirical evidence.
More insights are yet to be explored. 

2) The design of channel codes and the development of decoding methods have been extensively 
studied in the contexts of information theory and wireless communication. Some of these ideas have 
been transformed into design principles for sparse signal recovery [31], [32], [33] as mentioned
in the introduction. Thus far, however, the efforts in utilizing the codebook designs and decoding 
methods are mainly focused on the point-to-point channel model, which implies that the recovery 
methods iterate between first recovering one nonzero entry or a group of nonzero entries by treating 
the rest of them as noise and then removing the recovered nonzero entries from the residual signal.
In this paper, we established the analogy between the sparse signal recovery and the multiple access 
communication. It motivates us to envision opportunities beyond a point-to-point channel model. 
As one important question, for example, can we develop practical codes for joint decoding and 
reconstruction techniques to simultaneously recover all the nonzero entries? 

3) Last but not least, we return to one remaining open question from this paper. Recall that
for sublinear sparsity, there exists a certain regime in which the tight bound on the number of 
measurements is not known yet. Can we further improve the result in this regime, thereby closing 
the gap between sufficient and necessary conditions on the number of measurements for arbitrary 
scalings among the model parameters? 
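
The sequential-selection idea in direction 1) can be illustrated with a small sketch. The code below is a plain matching pursuit loop, written as an illustration only: at each step it detects the column most correlated with the residual and then subtracts that column's contribution, which is the step that resembles successive interference cancellation. The function name `matching_pursuit_support` and the list-of-columns data layout are assumptions made for this example, not part of the paper.

```python
from typing import List, Sequence


def matching_pursuit_support(
    columns: Sequence[Sequence[float]], y: Sequence[float], k: int
) -> List[int]:
    """Select k column indices one at a time, cancelling each from the residual."""
    residual = list(y)
    support: List[int] = []
    for _ in range(k):
        best_index, best_score, best_coeff = -1, -1.0, 0.0
        for j, col in enumerate(columns):
            if j in support:
                continue
            energy = sum(a * a for a in col)
            if energy == 0.0:
                continue
            corr = sum(a * r for a, r in zip(col, residual))
            score = abs(corr) / energy ** 0.5  # normalized correlation
            if score > best_score:
                best_index, best_score, best_coeff = j, score, corr / energy
        if best_index < 0:
            break
        support.append(best_index)
        # "Interference cancellation": remove the detected entry's contribution.
        chosen = columns[best_index]
        residual = [r - best_coeff * a for r, a in zip(residual, chosen)]
    return sorted(support)


if __name__ == "__main__":
    # Four mutually orthogonal columns in R^4 (hypothetical toy data).
    cols = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
    y = [3 * a - 2 * b for a, b in zip(cols[1], cols[3])]
    print(matching_pursuit_support(cols, y, 2))
```

With orthogonal columns and no noise the loop recovers the support exactly; with correlated columns or noise it can fail, which is where a joint decoding view may behave differently.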

Appendix A
Proof of Theorem 1 

The proof of Theorem 1 employs the distance decoding technique [50]. We will first randomly generate
the measurement matrix $A^{(m)}$ and show that the error probability averaged over this ensemble tends to
zero as $m \to \infty$. This naturally leads to the existence of a sequence of deterministic matrices for achieving
diminishing error probability of support reconstruction. We randomly generate the measurement matrix
$A^{(m)}$ with entries drawn independently according to $A^{(m)}_{ij} \sim \mathcal{N}(0, \sigma_a^2)$, $i \in [n_m]$, $j \in [m]$. Let $\mathbf{A}^{(m)}_j$
denote the $j$th column of $A^{(m)}$.

For simplicity of exposition, we describe the support recovery procedure for two distinct cases on the 
number of nonzero entries. 

Case 1: $k = 1$. In this case, the signal of interest is $\mathbf{X} = \mathbf{X}(w_1, S_1)$. Consider the following support
recovery procedure. Fix $\epsilon > 0$. First form an estimate $\hat{W}$ of $|w_1|$ as

$$\hat{W} \triangleq \sqrt{\frac{\left|\frac{1}{n_m}\|\mathbf{Y}\|^2 - \sigma_z^2\right|}{\sigma_a^2}}. \tag{21}$$

Declare that $\hat{s}_1 \in [m]$ is the estimated location for the nonzero entry, i.e., $d^{(m)}(\mathbf{Y}) = \{\hat{s}_1\}$, if it is the
unique index such that

$$\frac{1}{n_m}\left\|\mathbf{Y} - (-1)^q \hat{W} \mathbf{A}^{(m)}_{\hat{s}_1}\right\|^2 \le \sigma_z^2 + \epsilon^2\sigma_a^2 \tag{22}$$

for either $q = 1$ or $q = 2$. If there is none or more than one, pick an arbitrary index.
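
As a rough illustration of this $k = 1$ recovery rule, the sketch below forms the amplitude estimate and then tests every column against the distance threshold with both signs. It is a minimal sketch under stated assumptions: the function name `distance_decode_single`, the list-of-columns layout, and the choice to return all passing indices (rather than resolving ties arbitrarily) are ours.

```python
import math
from typing import List, Sequence


def distance_decode_single(
    columns: Sequence[Sequence[float]],
    y: Sequence[float],
    sigma_a2: float,
    sigma_z2: float,
    eps: float,
) -> List[int]:
    """Return every column index that passes the distance test for k = 1."""
    n = len(y)
    energy = sum(v * v for v in y) / n
    # Amplitude estimate: sqrt(| ||Y||^2 / n - sigma_z^2 | / sigma_a^2).
    w_hat = math.sqrt(abs(energy - sigma_z2) / sigma_a2)
    threshold = sigma_z2 + eps ** 2 * sigma_a2
    passing: List[int] = []
    for s, col in enumerate(columns):
        for sign in (-1.0, 1.0):  # (-1)^q for q = 1, 2
            dist = sum((v - sign * w_hat * a) ** 2 for v, a in zip(y, col)) / n
            if dist <= threshold:
                passing.append(s)
                break
    return passing


if __name__ == "__main__":
    # Hypothetical noiseless toy data: three orthogonal +/-1 columns in R^4.
    cols = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1]]
    y = [-2.0 * a for a in cols[1]]
    print(distance_decode_single(cols, y, sigma_a2=1.0, sigma_z2=0.0, eps=0.1))
```

The decoder in the text declares success only when exactly one index passes; a caller can check `len(passing) == 1` to mirror that uniqueness condition.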
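We now analyze the average probability of error

$$P(\mathcal{E}) = E_A\left[P_e(w_1, A^{(m)})\right] = P\left(d^{(m)}(\mathbf{Y}) \ne \{S_1\}\right)$$

where the expectation is taken with respect to the random measurement matrix $A^{(m)}$. Due to the symmetry
in the problem and the measurement matrix generation, we assume without loss of generality $S_1 = 1$,
that is,

$$\mathbf{Y} = w_1 \mathbf{A}^{(m)}_1 + \mathbf{Z}$$

for some $w_1$. In the following analysis, we drop superscripts and subscripts on $m$ for notational simplicity
when no ambiguity arises. Define the events

$$\mathcal{E}_s \triangleq \left\{\exists q \in \{1,2\} \text{ such that } \frac{1}{n}\left\|\mathbf{Y} - (-1)^q \hat{W}\mathbf{A}_s\right\|^2 \le \sigma_z^2 + \epsilon^2\sigma_a^2\right\}, \quad s \in [m].$$

Then

$$P(\mathcal{E}) \le P\left(\mathcal{E}_1^c \cup \left(\cup_{s=2}^m \mathcal{E}_s\right)\right). \tag{23}$$

Let

$$\mathcal{E}_{\mathrm{aux}} \triangleq \left\{\hat{W} - |w_1| \in (-\epsilon,\epsilon)\right\} \cap \left\{\frac{1}{n}\|\mathbf{Y}\|^2 - (w_1^2\sigma_a^2 + \sigma_z^2) \in (-\epsilon,\epsilon)\right\}.$$

Then, by the union of events bound and the fact that $A^c \cup B = A^c \cup (B \cap A)$,

$$P(\mathcal{E}) \le P(\mathcal{E}_{\mathrm{aux}}^c) + P(\mathcal{E}_1^c) + \sum_{s=2}^m P(\mathcal{E}_s \cap \mathcal{E}_{\mathrm{aux}}). \tag{24}$$

We bound each term in (24). First, by the weak law of large numbers (LLN), $\lim_{m\to\infty} P(\mathcal{E}_{\mathrm{aux}}^c) = 0$.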
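Next, we consider $P(\mathcal{E}_1^c)$. If $w_1 > 0$,

$$\frac{1}{n}\|\mathbf{Y} - \hat{W}\mathbf{A}_1\|^2 = \frac{1}{n}\|w_1\mathbf{A}_1 + \mathbf{Z} - \hat{W}\mathbf{A}_1\|^2 = (w_1 - \hat{W})^2\frac{\|\mathbf{A}_1\|^2}{n} + 2(w_1 - \hat{W})\frac{\mathbf{A}_1^T\mathbf{Z}}{n} + \frac{\|\mathbf{Z}\|^2}{n}. \tag{25}$$

For any $\epsilon_1 > 0$, as $m \to \infty$, by the LLN,

$$P\left(\left\{w_1 - \hat{W} \in (-\epsilon_1,\epsilon_1)\right\} \cap \left\{\frac{\|\mathbf{A}_1\|^2}{n} - \sigma_a^2 \in (-\epsilon_1,\epsilon_1)\right\}\right) \to 1.$$

Hence, we have for the first term in (25),

$$P\left((w_1 - \hat{W})^2\frac{\|\mathbf{A}_1\|^2}{n} \in \left[0, \epsilon_1^2(\sigma_a^2 + \epsilon_1)\right)\right) \to 1.$$

Following a similar reasoning, for the second term in (25),

$$P\left(2(w_1 - \hat{W})\frac{\mathbf{A}_1^T\mathbf{Z}}{n} \in (-\epsilon_1,\epsilon_1)\right) \to 1$$

and for the third term,

$$P\left(\frac{\|\mathbf{Z}\|^2}{n} \in (\sigma_z^2 - \epsilon_1, \sigma_z^2 + \epsilon_1)\right) \to 1.$$

Therefore, for any $\epsilon_1 > 0$,

$$\lim_{m\to\infty} P\left(\frac{1}{n}\|\mathbf{Y} - \hat{W}\mathbf{A}_1\|^2 \in (\sigma_z^2 - \epsilon_1, \sigma_z^2 + \epsilon_1)\right) = 1$$

which implies that

$$\lim_{m\to\infty} P\left(\frac{1}{n}\|\mathbf{Y} - \hat{W}\mathbf{A}_1\|^2 \le \sigma_z^2 + \epsilon^2\sigma_a^2\right) = 1.$$

Similarly, if $w_1 < 0$,

$$\lim_{m\to\infty} P\left(\frac{1}{n}\|\mathbf{Y} + \hat{W}\mathbf{A}_1\|^2 \le \sigma_z^2 + \epsilon^2\sigma_a^2\right) = 1.$$

Hence, $\lim_{m\to\infty} P(\mathcal{E}_1^c) = 0$.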

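For the third term in (24), we need the following lemma, whose proof is presented at the end of this
appendix:

Lemma 1: Let $0 < \beta < \alpha$. Let $\{u_i\}_{i=1}^n$ be a real sequence satisfying

$$\frac{1}{n}\sum_{i=1}^n u_i^2 \in (\alpha - \beta, \alpha + \beta).$$

Let $\{V_i\}_{i=1}^n$ be an i.i.d. random sequence where $V_i \sim \mathcal{N}(0, \sigma^2)$. Then, for any $\gamma \in (0, \alpha - \beta)$,

$$P\left(\frac{1}{n}\sum_{i=1}^n (u_i - V_i)^2 \le \gamma\right) \le 2^{-\frac{n}{2}\log\frac{\alpha-\beta}{\gamma}}.$$
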
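Continuing the proof of Theorem 1, we consider $P(\mathcal{E}_s \cap \mathcal{E}_{\mathrm{aux}})$ for $s \ne 1$. Then

$$P(\mathcal{E}_s \cap \mathcal{E}_{\mathrm{aux}}) \le P(\mathcal{E}_s \mid \mathcal{E}_{\mathrm{aux}}) = \int P(\mathcal{E}_s \mid \{\mathbf{Y} = \mathbf{y}\} \cap \mathcal{E}_{\mathrm{aux}})\, f(\mathbf{y} \mid \mathcal{E}_{\mathrm{aux}})\, d\mathbf{y}.$$

Since $\mathbf{A}_s$ is independent of $\mathbf{Y}$ and $\hat{W}$, it follows from the definition of $\mathcal{E}_{\mathrm{aux}}$ and Lemma 1 (with
$\alpha = w_1^2\sigma_a^2 + \sigma_z^2$, $\beta = \epsilon$, and $\gamma = \sigma_z^2 + \epsilon^2\sigma_a^2$) that

$$P\left(\frac{1}{n}\left\|\mathbf{Y} - (-1)^q\hat{W}\mathbf{A}_s\right\|^2 \le \sigma_z^2 + \epsilon^2\sigma_a^2 \,\Big|\, \{\mathbf{Y} = \mathbf{y}\} \cap \mathcal{E}_{\mathrm{aux}}\right) \le 2^{-\frac{n}{2}\log\frac{w_1^2\sigma_a^2 + \sigma_z^2 - \epsilon}{\sigma_z^2 + \epsilon^2\sigma_a^2}}$$

for $q = 1, 2$, if $\epsilon$ is sufficiently small. Thus,

$$P(\mathcal{E}_s \mid \{\mathbf{Y} = \mathbf{y}\} \cap \mathcal{E}_{\mathrm{aux}}) \le 2 \cdot 2^{-\frac{n}{2}\log\frac{w_1^2\sigma_a^2 + \sigma_z^2 - \epsilon}{\sigma_z^2 + \epsilon^2\sigma_a^2}}$$

and therefore

$$\sum_{s=2}^m P(\mathcal{E}_s \cap \mathcal{E}_{\mathrm{aux}}) \le 2(m-1) \cdot 2^{-\frac{n}{2}\log\frac{w_1^2\sigma_a^2 + \sigma_z^2 - \epsilon}{\sigma_z^2 + \epsilon^2\sigma_a^2}}$$

which tends to zero as $m \to \infty$, if

$$\limsup_{m\to\infty} \frac{\log m}{n_m} < \frac{1}{2}\log\frac{w_1^2\sigma_a^2 + \sigma_z^2 - \epsilon}{\sigma_z^2 + \epsilon^2\sigma_a^2}. \tag{26}$$

Therefore, by (24), the probability of error averaged over the random matrix $A^{(m)}$, $P(\mathcal{E})$, tends to zero as
$m \to \infty$, if (26) is satisfied, which in turn implies that there exists a sequence of nonrandom measurement
matrices $\{A_0^{(m)}\}_{m=k}^{\infty}$ such that $\frac{1}{n_m m}\|A_0^{(m)}\|_F^2 \le \sigma_a^2$ and $\lim_{m\to\infty} P_e(w_1, A_0^{(m)}) = 0$, if (26) is satisfied.

Finally, since $\epsilon > 0$ is chosen arbitrarily, we have the desired proof of Theorem 1.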

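Case 2: $k \ge 2$. In this case, the signal of interest is $\mathbf{X} = \mathbf{X}(\mathbf{w}, \mathbf{S})$, where $\mathbf{w} = [w_1, \ldots, w_k]^T$ and
$\mathbf{S} = [S_1, \ldots, S_k]^T$. Consider the following support recovery procedure. Fix $\epsilon > 0$. First, form an estimate
$\hat{W}$ of $\|\mathbf{w}\|$ as

$$\hat{W} \triangleq \sqrt{\frac{\left|\frac{1}{n_m}\|\mathbf{Y}\|^2 - \sigma_z^2\right|}{\sigma_a^2}}. \tag{27}$$

For $r, \zeta > 0$, let $Q = Q(r, \zeta)$ be a minimal set of points in $\mathbb{R}^k$ satisfying the following properties:

i) $Q \subseteq B_k(r)$, where $B_k(r)$ is the $k$-dimensional hypersphere of radius $r$.
ii) For any $\mathbf{b} \in B_k(r)$, there exists $\hat{\mathbf{w}} \in Q$ such that $\|\hat{\mathbf{w}} - \mathbf{b}\| \le \zeta/2$.

The following properties can be easily proved:

Lemma 2: 1) $\lim_{m\to\infty} P\left(\exists \hat{\mathbf{w}} \in Q(\hat{W}, \zeta) \text{ such that } \|\hat{\mathbf{w}} - \mathbf{w}\| < \zeta\right) = 1$.

2) $q(r, \zeta) \triangleq |Q(r, \zeta)|$ is monotonically non-decreasing in $r$ for fixed $\zeta$.

Given $\hat{W}$ and $\epsilon$, fix $Q = Q(\hat{W}, \epsilon)$. Declare $d(\mathbf{Y}) = \{\hat{s}_1, \hat{s}_2, \ldots, \hat{s}_k\} \subseteq [m]$ is the recovered support of
the signal, if it is the unique set of indices such that

$$\frac{1}{n_m}\left\|\mathbf{Y} - \sum_{j=1}^k \hat{w}_j\mathbf{A}_{\hat{s}_j}\right\|^2 \le \sigma_z^2 + \epsilon^2\sigma_a^2 \tag{28}$$

for some $\hat{\mathbf{w}} \in Q$. If there is none or more than one such set, pick an arbitrary set of $k$ indices.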
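Next, we analyze the average probability of error

$$P(\mathcal{E}) = E_A[P_e(\mathbf{w}, A)] = P\left(d(\mathbf{Y}) \ne \{S_1, \ldots, S_k\}\right)$$

where the expectation is taken with respect to $A$. As before, we assume without loss of generality that
$S_j = j$ for $j = 1, 2, \ldots, k$, which gives

$$\mathbf{Y} = \sum_{j=1}^k w_j\mathbf{A}_j + \mathbf{Z}$$

for some $\mathbf{w}$. Define the event

$$\mathcal{E}_{s_1,s_2,\ldots,s_k} \triangleq \left\{\exists \hat{\mathbf{w}} \in Q \text{ and } \{s'_1, s'_2, \ldots, s'_k\} = \{s_1, s_2, \ldots, s_k\} \text{ such that } \frac{1}{n}\left\|\mathbf{Y} - \sum_{j=1}^k \hat{w}_j\mathbf{A}_{s'_j}\right\|^2 \le \sigma_z^2 + \epsilon^2\sigma_a^2\right\}.$$

Then

$$P(\mathcal{E}) \le P\left(\mathcal{E}_{1,2,\ldots,k}^c \cup \left(\bigcup_{s_1<\cdots<s_k:\{s_1,\ldots,s_k\}\ne[k]} \mathcal{E}_{s_1,s_2,\ldots,s_k}\right)\right) \le P(\mathcal{E}_{\mathrm{aux}}^c) + P(\mathcal{E}_{1,2,\ldots,k}^c) + \sum_{s_1<\cdots<s_k:\{s_1,\ldots,s_k\}\ne[k]} P(\mathcal{E}_{s_1,s_2,\ldots,s_k} \cap \mathcal{E}_{\mathrm{aux}}) \tag{29}$$

where in this case

$$\mathcal{E}_{\mathrm{aux}} \triangleq \left\{\hat{W} - \|\mathbf{w}\| \in (-\epsilon,\epsilon)\right\} \cap \left\{\frac{1}{n}\|\mathbf{Z}\|^2 - \sigma_z^2 \in (-\epsilon,\epsilon)\right\}.$$
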

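We now bound the terms in (29). First, by the LLN, $\lim_{m\to\infty} P(\mathcal{E}_{\mathrm{aux}}^c) = 0$. Next, we consider
$P(\mathcal{E}_{1,2,\ldots,k}^c)$. Note that, for any $\hat{\mathbf{w}} \in Q$,

$$\frac{1}{n}\left\|\mathbf{Y} - \sum_{j=1}^k \hat{w}_j\mathbf{A}_j\right\|^2 = \frac{1}{n}\sum_{i=1}^k\sum_{j=1}^k (w_i - \hat{w}_i)(w_j - \hat{w}_j)\mathbf{A}_i^T\mathbf{A}_j + \frac{2}{n}\sum_{j=1}^k (w_j - \hat{w}_j)\mathbf{A}_j^T\mathbf{Z} + \frac{1}{n}\|\mathbf{Z}\|^2. \tag{30}$$

By applying the LLN to each term in (30), as similarly done in case 1, and using Lemma 2-1), we have

$$\lim_{m\to\infty} P\left(\exists \hat{\mathbf{w}} \in Q \text{ such that } \frac{1}{n}\left\|\mathbf{Y} - \sum_{j=1}^k \hat{w}_j\mathbf{A}_j\right\|^2 \le \sigma_z^2 + \epsilon^2\sigma_a^2\right) = 1$$

which implies that $\lim_{m\to\infty} P(\mathcal{E}_{1,2,\ldots,k}^c) = 0$.

Next, we consider $P(\mathcal{E}_{s_1,s_2,\ldots,s_k} \cap \mathcal{E}_{\mathrm{aux}})$ for $\{s_1, s_2, \ldots, s_k\} \ne [k]$. Note that

$$P(\mathcal{E}_{s_1,\ldots,s_k} \cap \mathcal{E}_{\mathrm{aux}}) \le P(\mathcal{E}_{s_1,\ldots,s_k} \mid \mathcal{E}_{\mathrm{aux}}) = \int P\left(\mathcal{E}_{s_1,\ldots,s_k} \mid \{\mathbf{A}_1 = \mathbf{a}_1\} \cap \cdots \cap \{\mathbf{A}_k = \mathbf{a}_k\} \cap \{\mathbf{Z} = \mathbf{z}\} \cap \mathcal{E}_{\mathrm{aux}}\right) f(\mathbf{a}_1, \ldots, \mathbf{a}_k, \mathbf{z} \mid \mathcal{E}_{\mathrm{aux}})\, d\mathbf{a}_1 \cdots d\mathbf{a}_k\, d\mathbf{z}. \tag{31}$$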

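For notational simplicity, define $\gamma \triangleq \sigma_z^2 + \epsilon^2\sigma_a^2$, $T \triangleq \{s_1, s_2, \ldots, s_k\} \cap [k]$, $T^c \triangleq \{s_1, s_2, \ldots, s_k\} \setminus T$,
and $\mathcal{E}_{\mathrm{cond}} \triangleq \{\mathbf{A}_1 = \mathbf{a}_1\} \cap \cdots \cap \{\mathbf{A}_k = \mathbf{a}_k\} \cap \{\mathbf{Z} = \mathbf{z}\} \cap \mathcal{E}_{\mathrm{aux}}$. For any permutation $(s'_1, s'_2, \ldots, s'_k)$ of
$\{s_1, s_2, \ldots, s_k\}$ and any $\hat{\mathbf{w}} \in Q$,

$$P\left(\frac{1}{n}\left\|\mathbf{Y} - \sum_{j=1}^k \hat{w}_j\mathbf{A}_{s'_j}\right\|^2 \le \gamma \,\Big|\, \mathcal{E}_{\mathrm{cond}}\right) = P\left(\frac{1}{n}\left\|\sum_{j=1}^k w_j\mathbf{A}_j + \mathbf{Z} - \sum_{j=1}^k \hat{w}_j\mathbf{A}_{s'_j}\right\|^2 \le \gamma \,\Big|\, \mathcal{E}_{\mathrm{cond}}\right)$$

$$= P\left(\frac{1}{n}\left\|\left(\sum_{j=1}^k w_j\mathbf{A}_j - \sum_{s'_j \in T} \hat{w}_j\mathbf{A}_{s'_j} + \mathbf{Z}\right) - \sum_{s'_j \in T^c} \hat{w}_j\mathbf{A}_{s'_j}\right\|^2 \le \gamma \,\Big|\, \mathcal{E}_{\mathrm{cond}}\right). \tag{32}$$

Conditioned on $\mathcal{E}_{\mathrm{cond}}$ and the chosen $Q$, $\frac{1}{n}\left\|\sum_{j=1}^k w_j\mathbf{A}_j - \sum_{s'_j \in T} \hat{w}_j\mathbf{A}_{s'_j} + \mathbf{Z}\right\|^2$ is a fixed quantity
satisfying

$$\frac{1}{n}\left\|\sum_{j=1}^k w_j\mathbf{A}_j - \sum_{s'_j \in T} \hat{w}_j\mathbf{A}_{s'_j} + \mathbf{Z}\right\|^2 \in \left(\Big(\sum_{j\in[k]\setminus T} w_j^2 + \sum_{s'_j\in T}(w_{s'_j} - \hat{w}_j)^2\Big)\sigma_a^2 + \sigma_z^2 - \delta_1,\ \Big(\sum_{j\in[k]\setminus T} w_j^2 + \sum_{s'_j\in T}(w_{s'_j} - \hat{w}_j)^2\Big)\sigma_a^2 + \sigma_z^2 + \delta_1\right)$$

for some positive $\delta_1$ that depends on $\mathbf{w}$ and $\epsilon$ only, and is non-decreasing in $\epsilon$. Meanwhile, $\mathbf{A}_{s'_j}$ is
independent of $\mathbf{A}_1, \ldots, \mathbf{A}_k$, and $\mathbf{Z}$ for $s'_j \in T^c$. Hence, by Lemma 1 (with $\alpha = \big(\sum_{j\in[k]\setminus T} w_j^2 +
\sum_{s'_j\in T}(w_{s'_j} - \hat{w}_j)^2\big)\sigma_a^2 + \sigma_z^2$, $\beta = \delta_1$, and $\gamma = \sigma_z^2 + \epsilon^2\sigma_a^2$), (32) is upper-bounded by

$$2^{-\frac{n}{2}\log\frac{\left(\sum_{j\in[k]\setminus T} w_j^2 + \sum_{s'_j\in T}(w_{s'_j} - \hat{w}_j)^2\right)\sigma_a^2 + \sigma_z^2 - \delta_1}{\sigma_z^2 + \epsilon^2\sigma_a^2}} \le 2^{-\frac{n}{2}\log\frac{\sum_{j\in[k]\setminus T} w_j^2\sigma_a^2 + \sigma_z^2 - \delta_1}{\sigma_z^2 + \epsilon^2\sigma_a^2}}.$$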
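Hence, by the union of events bound,

$$P(\mathcal{E}_{s_1,s_2,\ldots,s_k} \mid \mathcal{E}_{\mathrm{cond}}) \le \sum_{\{s'_1,\ldots,s'_k\}=\{s_1,\ldots,s_k\}} P\left(\exists \hat{\mathbf{w}} \in Q \text{ such that } \frac{1}{n}\left\|\mathbf{Y} - \sum_{j=1}^k \hat{w}_j\mathbf{A}_{s'_j}\right\|^2 \le \gamma \,\Big|\, \mathcal{E}_{\mathrm{cond}}\right)$$

$$\le \sum_{\{s'_1,\ldots,s'_k\}=\{s_1,\ldots,s_k\}} \sum_{\hat{\mathbf{w}} \in Q} P\left(\frac{1}{n}\left\|\mathbf{Y} - \sum_{j=1}^k \hat{w}_j\mathbf{A}_{s'_j}\right\|^2 \le \gamma \,\Big|\, \mathcal{E}_{\mathrm{cond}}\right) \le k! \cdot |Q| \cdot 2^{-\frac{n}{2}\log\frac{\sum_{j\in[k]\setminus T} w_j^2\sigma_a^2 + \sigma_z^2 - \delta_1}{\sigma_z^2 + \epsilon^2\sigma_a^2}}.$$

Furthermore, conditioned on $\mathcal{E}_{\mathrm{aux}}$, $\hat{W} \le \|\mathbf{w}\| + \epsilon$ and hence $|Q| \le q(\|\mathbf{w}\| + \epsilon, \epsilon)$ by Lemma 2-2). Thus,

$$P(\mathcal{E}_{s_1,s_2,\ldots,s_k} \cap \mathcal{E}_{\mathrm{aux}}) \le k! \cdot q(\|\mathbf{w}\| + \epsilon, \epsilon) \cdot 2^{-\frac{n}{2}\log\frac{\sum_{j\in[k]\setminus T} w_j^2\sigma_a^2 + \sigma_z^2 - \delta_1}{\sigma_z^2 + \epsilon^2\sigma_a^2}}. \tag{33}$$
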

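Note that the probability upper bound (33) depends on $s_1, \ldots, s_k$ only through $T$. Grouping the $\binom{m-k}{k-|T|}$
events $\{\mathcal{E}_{s_1,s_2,\ldots,s_k} \cap \mathcal{E}_{\mathrm{aux}}\}$ with the same $T$,

$$P(\mathcal{E}) \le P(\mathcal{E}_{\mathrm{aux}}^c) + P(\mathcal{E}_{1,2,\ldots,k}^c) + \sum_{T \subsetneq [k]} \binom{m-k}{k-|T|}\, k! \cdot q(\|\mathbf{w}\| + \epsilon, \epsilon) \cdot 2^{-\frac{n}{2}\log\frac{\sum_{j\in[k]\setminus T} w_j^2\sigma_a^2 + \sigma_z^2 - \delta_1}{\sigma_z^2 + \epsilon^2\sigma_a^2}}$$

$$\le P(\mathcal{E}_{\mathrm{aux}}^c) + P(\mathcal{E}_{1,2,\ldots,k}^c) + k! \cdot q(\|\mathbf{w}\| + \epsilon, \epsilon) \cdot \sum_{T \subseteq [k],\, T \ne \emptyset} 2^{|T|\log m - \frac{n}{2}\log\frac{\sum_{j\in T} w_j^2\sigma_a^2 + \sigma_z^2 - \delta_1}{\sigma_z^2 + \epsilon^2\sigma_a^2}}$$

where the last step uses $\binom{m-k}{k-|T|} \le m^{k-|T|}$ and then replaces $T$ by its complement $[k] \setminus T$ in the sum. The
right-hand side tends to zero as $m \to \infty$, if

$$\limsup_{m\to\infty} \frac{\log m}{n_m} < \frac{1}{2|T|}\log\frac{\sum_{j\in T} w_j^2\sigma_a^2 + \sigma_z^2 - \delta_1}{\sigma_z^2 + \epsilon^2\sigma_a^2} \tag{34}$$

for all nonempty $T \subseteq [k]$. Similar to the reasoning in case 1, it implies the existence of a sequence of nonrandom
measurement matrices $\{A_0^{(m)}\}_{m=k}^{\infty}$ such that $\frac{1}{n_m m}\|A_0^{(m)}\|_F^2 \le \sigma_a^2$ and $\lim_{m\to\infty} P_e(\mathbf{w}, A_0^{(m)}) = 0$ if (34)
is satisfied. Since $\epsilon > 0$ is arbitrarily chosen, the proof of Theorem 1 is complete.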

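Now, it only remains to prove Lemma 1. For simplicity, let $\theta = \sigma^2$. Denote $S_n = \frac{1}{n}\sum_{i=1}^n (u_i - V_i)^2$.
The moment generating function of $S_n$ is

$$E\left[e^{tS_n}\right] = \prod_{i=1}^n E\left[e^{t(u_i - V_i)^2/n}\right]. \tag{35}$$

Note that $(u_i - V_i)^2/\theta$ is a noncentral $\chi^2$ random variable. Its moment generating function is given by
[?] as $E\left[e^{t(u_i - V_i)^2/\theta}\right] = \exp\left(\frac{u_i^2 t}{\theta(1 - 2t)}\right)/(1 - 2t)^{1/2}$ for $t < 1/2$. By the change of variable $t \to \theta t/n$, we have

$$E\left[e^{t(u_i - V_i)^2/n}\right] = \frac{\exp\left(\frac{u_i^2 t/n}{1 - 2\theta t/n}\right)}{(1 - 2\theta t/n)^{1/2}}.$$

Back to (35), we obtain

$$E\left[e^{tS_n}\right] = \prod_{i=1}^n \frac{\exp\left(\frac{u_i^2 t/n}{1 - 2\theta t/n}\right)}{(1 - 2\theta t/n)^{1/2}} = \frac{\exp\left(\frac{\frac{t}{n}\sum_{i=1}^n u_i^2}{1 - 2\theta t/n}\right)}{(1 - 2\theta t/n)^{n/2}}.$$
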

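The Chernoff bound implies

$$P(S_n \le \gamma) \le \min_{s>0} e^{s\gamma} E\left[e^{-sS_n}\right] = \min_{s>0} e^{s\gamma}\, \frac{\exp\left(-\frac{\frac{s}{n}\sum_{i=1}^n u_i^2}{1 + 2\theta s/n}\right)}{(1 + 2\theta s/n)^{n/2}} = \min_{p<0} e^{-p\gamma}\, \frac{\exp\left(\frac{\frac{p}{n}\sum_{i=1}^n u_i^2}{1 - 2\theta p/n}\right)}{(1 - 2\theta p/n)^{n/2}}$$

$$= \exp\left\{\min_{p<0}\left\{-p\gamma + \frac{\frac{p}{n}\sum_{i=1}^n u_i^2}{1 - 2\theta p/n} - \frac{n}{2}\ln(1 - 2\theta p/n)\right\}\right\}.$$

Define

$$f(p) \triangleq -p\gamma + \frac{\frac{p}{n}\sum_{i=1}^n u_i^2}{1 - 2\theta p/n} - \frac{n}{2}\ln(1 - 2\theta p/n),$$

$$g(\lambda) \triangleq f(n\lambda) = -n\lambda\gamma + \frac{\lambda\sum_{i=1}^n u_i^2}{1 - 2\theta\lambda} - \frac{n}{2}\ln(1 - 2\theta\lambda).$$

Clearly, $\min_{p<0} f(p) = \min_{\lambda<0} g(\lambda)$. Denote

$$\alpha_s \triangleq \frac{1}{n}\sum_{i=1}^n u_i^2.$$

Then, let us focus on the minimization problem

$$\min_{\lambda<0} g(\lambda) = \min_{\lambda<0}\left\{-n\lambda\gamma + \frac{n\lambda\alpha_s}{1 - 2\theta\lambda} - \frac{n}{2}\ln(1 - 2\theta\lambda)\right\} = n \cdot \min_{\lambda<0}\left\{-\lambda\gamma + \frac{\lambda\alpha_s}{1 - 2\theta\lambda} - \frac{1}{2}\ln(1 - 2\theta\lambda)\right\}$$

$$= -n \cdot \underbrace{\max_{\lambda<0}\left\{\lambda\gamma - \frac{\lambda\alpha_s}{1 - 2\theta\lambda} + \frac{1}{2}\ln(1 - 2\theta\lambda)\right\}}_{\triangleq\, \Lambda(\alpha_s, \theta, \gamma)}.$$

It can be shown that the optimizing $\lambda$ is

$$\lambda^* = \frac{2\gamma - \theta - \sqrt{\theta^2 + 4\alpha_s\gamma}}{4\theta\gamma}$$

and

$$\Lambda(\alpha_s, \theta, \gamma) = \lambda^*\gamma - \frac{\lambda^*\alpha_s}{1 - 2\lambda^*\theta} + \frac{1}{2}\ln(1 - 2\lambda^*\theta) = \frac{\alpha_s + \gamma}{2\theta} - \frac{\sqrt{\theta^2 + 4\alpha_s\gamma}}{2\theta} + \frac{1}{2}\ln\frac{\theta + \sqrt{\theta^2 + 4\alpha_s\gamma}}{2\gamma}.$$

Next, for fixed $\alpha_s$ and $\gamma$,

$$\frac{\partial\Lambda(\alpha_s, \theta, \gamma)}{\partial\theta} = -\frac{\alpha_s + \gamma - \sqrt{\theta^2 + 4\alpha_s\gamma}}{2\theta^2} - \frac{1}{2\sqrt{\theta^2 + 4\alpha_s\gamma}} + \frac{1}{2\left(\theta + \sqrt{\theta^2 + 4\alpha_s\gamma}\right)}\left(1 + \frac{\theta}{\sqrt{\theta^2 + 4\alpha_s\gamma}}\right)$$

$$= -\frac{\alpha_s + \gamma}{2\theta^2} + \frac{\sqrt{4\alpha_s\gamma + \theta^2}}{2\theta^2}.$$

For $\theta > 0$, there is only one stationary point $\theta' = \alpha_s - \gamma$, which is a solution to $\frac{\partial\Lambda(\alpha_s, \theta, \gamma)}{\partial\theta} = 0$. Check
the second derivative,

$$\frac{\partial^2\Lambda(\alpha_s, \theta, \gamma)}{\partial\theta^2}\bigg|_{\theta = \theta'} = \frac{1}{2(\alpha_s + \gamma)(\alpha_s - \gamma)} > 0.$$

This confirms that $\theta' = \alpha_s - \gamma$ is the minimum point of $\Lambda(\alpha_s, \theta, \gamma)$, for $\theta > 0$. Hence, for fixed $\alpha_s$ and
$\gamma$ with $\gamma < \alpha_s$,

$$\Lambda(\alpha_s, \theta, \gamma) \ge \Lambda(\alpha_s, \theta', \gamma) = \frac{1}{2}\ln\frac{\alpha_s}{\gamma}.$$

As a result,

$$P(S_n \le \gamma) \le \exp\left\{\min_{p<0}\left\{-p\gamma + \frac{\frac{p}{n}\sum_{i=1}^n u_i^2}{1 - 2\theta p/n} - \frac{n}{2}\ln(1 - 2\theta p/n)\right\}\right\} = \exp\left\{\min_{\lambda<0} g(\lambda)\right\} = \exp\{-n\Lambda(\alpha_s, \theta, \gamma)\}$$

$$\le \exp\{-n\Lambda(\alpha_s, \theta', \gamma)\} = \exp\left\{-\frac{n}{2}\ln\frac{\alpha_s}{\gamma}\right\} \le \exp\left\{-\frac{n}{2}\ln\frac{\alpha - \beta}{\gamma}\right\}.$$

Hence, by changing the base of logarithm,

$$P\left(\frac{1}{n}\sum_{i=1}^n (u_i - V_i)^2 \le \gamma\right) \le 2^{-\frac{n}{2}\log\frac{\alpha - \beta}{\gamma}}.$$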
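
The key step of the proof, that the exponent $\Lambda(\alpha_s, \theta, \gamma)$ is smallest at $\theta' = \alpha_s - \gamma$, can be checked numerically. The sketch below evaluates the closed-form exponent obtained by substituting the optimizing $\lambda$ into the Chernoff objective, $(\alpha_s + \gamma - \sqrt{\theta^2 + 4\alpha_s\gamma})/(2\theta) + \tfrac{1}{2}\ln\big((\theta + \sqrt{\theta^2 + 4\alpha_s\gamma})/(2\gamma)\big)$, in natural-log units. The function names are hypothetical and chosen for this example only.

```python
import math


def chernoff_exponent(alpha_s: float, theta: float, gamma: float) -> float:
    """Exponent Lambda(alpha_s, theta, gamma) in nats, for theta > 0, 0 < gamma < alpha_s."""
    root = math.sqrt(theta ** 2 + 4.0 * alpha_s * gamma)
    return (alpha_s + gamma - root) / (2.0 * theta) + 0.5 * math.log(
        (theta + root) / (2.0 * gamma)
    )


def worst_case_exponent(alpha_s: float, gamma: float) -> float:
    """Value of the exponent at theta' = alpha_s - gamma, i.e. 0.5 * ln(alpha_s / gamma)."""
    return 0.5 * math.log(alpha_s / gamma)


if __name__ == "__main__":
    alpha_s, gamma = 3.0, 1.0
    for theta in (0.5, 1.0, 2.0, 4.0, 10.0):
        print(theta, chernoff_exponent(alpha_s, theta, gamma))
    print("minimum:", worst_case_exponent(alpha_s, gamma))
```

Because the bound in Lemma 1 uses the worst case over $\theta$, it holds regardless of the variance of the $V_i$, which is what allows it to be applied with the data-dependent estimate $\hat{W}$.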

Appendix B 
Proof of Theorem 2 

The main techniques for the proof of Theorem 2 include Fano's inequality and the properties of entropy. 
It mimics the proof of the converse for the channel coding theorem [47] with proper modification.

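Assume there exist a sequence of measurement matrices $\{A^{(m)}\}_{m=k}^{\infty}$ and a sequence of support recovery
maps $\{d^{(m)}\}_{m=k}^{\infty}$ such that

$$\frac{1}{n_m m}\left\|A^{(m)}\right\|_F^2 \le \sigma_a^2 \tag{36}$$

and $\lim_{m\to\infty} P\left(d^{(m)}(\mathbf{Y}) \ne \{S_1, \ldots, S_k\}\right) = 0$. We wish to show that $\limsup_{m\to\infty} (\log m)/n_m \le c(\mathbf{w})$.

For any $T \subseteq [k]$, denote the tuple of random variables $(S_l : l \in T)$ by $S(T)$. From Fano's inequality
[47], we have

$$H(S(T) \mid \mathbf{Y}) \le H(S_1, \ldots, S_k \mid \mathbf{Y}) \le \log k! + H(\{S_1, \ldots, S_k\} \mid \mathbf{Y}) \le \log k! + P_e(\mathbf{w}, A^{(m)})\log\binom{m}{k} + 1. \tag{37}$$

For notational simplicity, let $P_e^{(m)} = P_e(\mathbf{w}, A^{(m)})$. On the other hand,

$$H(S(T) \mid S(T^c)) = \log\left(\prod_{q=0}^{|T|-1} (m - (k - |T|) - q)\right) = |T|\log m - n\epsilon_{1,n} \tag{38}$$

where $T^c = [k] \setminus T$ and

$$\epsilon_{1,n} \triangleq \frac{1}{n}\log\left(m^{|T|} \Big/ \prod_{q=0}^{|T|-1} (m - (k - |T|) - q)\right)$$

which tends to zero as $n \to \infty$. Hence, combining (37) and (38), we have

$$|T|\log m = H(S(T) \mid S(T^c)) + n\epsilon_{1,n}$$
$$= I(S(T); \mathbf{Y} \mid S(T^c)) + H(S(T) \mid \mathbf{Y}, S(T^c)) + n\epsilon_{1,n}$$
$$\le I(S(T); \mathbf{Y} \mid S(T^c)) + H(S(T) \mid \mathbf{Y}) + n\epsilon_{1,n}$$
$$\le I(S(T); \mathbf{Y} \mid S(T^c)) + \log k! + P_e^{(m)}\log\binom{m}{k} + 1 + n\epsilon_{1,n}$$
$$= \sum_{i=1}^n I(Y_i; S(T) \mid Y_1^{i-1}, S(T^c)) + \log k! + P_e^{(m)}\log\binom{m}{k} + 1 + n\epsilon_{1,n}$$
$$\le \sum_{i=1}^n \left(h(Y_i \mid S(T^c)) - h(Y_i \mid S_1, \ldots, S_k)\right) + \log k! + P_e^{(m)}\log\binom{m}{k} + 1 + n\epsilon_{1,n}$$
$$= \sum_{i=1}^n \left(h(Y_i \mid S(T^c)) - h(Z_i)\right) + \log k! + P_e^{(m)}\log\binom{m}{k} + 1 + n\epsilon_{1,n} \tag{39}$$

where (39) follows since the measurement matrix is fixed and $Z_i$ is independent of $(S_1, \ldots, S_k)$.
Consider

$$h(Y_i \mid S(T^c)) = h\left(\sum_{j=1}^k w_j a_{i,S_j} + Z_i \,\Big|\, S(T^c)\right) = h\left(\sum_{j\in T} w_j a_{i,S_j} + Z_i \,\Big|\, S(T^c)\right)$$
$$\le h\left(\sum_{j\in T} w_j a_{i,S_j} + Z_i\right) \le \frac{1}{2}\log\left(2\pi e \cdot \mathrm{Var}\left(\sum_{j\in T} w_j a_{i,S_j} + Z_i\right)\right) \tag{40}$$

where the last inequality follows since the Gaussian random variable maximizes the differential entropy
given a variance constraint. To further upper-bound (40), note that

$$\mathrm{Var}\left(\sum_{j\in T} w_j a_{i,S_j} + Z_i\right) = E\left[\Big(\sum_{j\in T} w_j a_{i,S_j}\Big)^2\right] - \left(E\left[\sum_{j\in T} w_j a_{i,S_j}\right]\right)^2 + \sigma_z^2.$$

Now

$$E\left[\Big(\sum_{j\in T} w_j a_{i,S_j}\Big)^2\right] = \sum_{j\in T} w_j^2\, \frac{1}{m}\sum_{p=1}^m a_{i,p}^2 + \sum_{j\in T}\sum_{l\in T,\, l\ne j} \frac{w_j w_l}{m(m-1)} \sum_{p=1}^m\sum_{q\ne p} a_{i,p}a_{i,q}$$

and

$$\left(E\left[\sum_{j\in T} w_j a_{i,S_j}\right]\right)^2 = \Big(\sum_{j\in T} w_j\Big)^2 \left(\frac{1}{m}\sum_{p=1}^m a_{i,p}\right)^2.$$

Since $\sum_{p=1}^m\sum_{q\ne p} a_{i,p}a_{i,q} = \left(\sum_{p=1}^m a_{i,p}\right)^2 - \sum_{p=1}^m a_{i,p}^2$, we have

$$\mathrm{Var}\left(\sum_{j\in T} w_j a_{i,S_j} + Z_i\right) = \Big(\sum_{j\in T} w_j^2 - \tau(m)\Big)\frac{1}{m}\sum_{p=1}^m a_{i,p}^2 + \left(\sum_{j\in T}\sum_{l\in T,\, l\ne j} \frac{w_j w_l}{m(m-1)} - \frac{\big(\sum_{j\in T} w_j\big)^2}{m^2}\right)\Big(\sum_{p=1}^m a_{i,p}\Big)^2 + \sigma_z^2$$

where $\tau(m) \triangleq \sum_{j\in T}\sum_{l\in T,\, l\ne j} \frac{w_j w_l}{m-1} \to 0$ as $m \to \infty$. It can be also easily checked that

$$\sum_{j\in T}\sum_{l\in T,\, l\ne j} \frac{w_j w_l}{m(m-1)} - \frac{\big(\sum_{j\in T} w_j\big)^2}{m^2} \le 0$$

and thus

$$\mathrm{Var}\left(\sum_{j\in T} w_j a_{i,S_j} + Z_i\right) \le \Big(\sum_{j\in T} w_j^2 - \tau(m)\Big)\frac{1}{m}\sum_{p=1}^m a_{i,p}^2 + \sigma_z^2.$$
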

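Returning to (39), we have

$$|T|\log m \le \sum_{i=1}^n \frac{1}{2}\log\left(2\pi e\left(\Big(\sum_{j\in T} w_j^2 - \tau(m)\Big)\frac{1}{m}\sum_{p=1}^m a_{i,p}^2 + \sigma_z^2\right)\right) - \frac{n}{2}\log(2\pi e\sigma_z^2) + \log k! + P_e^{(m)}\log\binom{m}{k} + 1 + n\epsilon_{1,n}$$

$$\le \frac{n}{2}\log\left(2\pi e\left(\Big(\sum_{j\in T} w_j^2 - \tau(m)\Big)\frac{1}{nm}\sum_{i=1}^n\sum_{p=1}^m a_{i,p}^2 + \sigma_z^2\right)\right) - \frac{n}{2}\log(2\pi e\sigma_z^2) + \log k! + P_e^{(m)}\log\binom{m}{k} + 1 + n\epsilon_{1,n} \tag{41}$$

$$\le \frac{n}{2}\log\left(\Big(\sum_{j\in T} w_j^2 - \tau(m)\Big)\frac{\sigma_a^2}{\sigma_z^2} + 1\right) + \log k! + P_e^{(m)}\log\binom{m}{k} + 1 + n\epsilon_{1,n} \tag{42}$$

where (41) is due to Jensen's inequality and (42) follows from (36). Therefore,

$$\limsup_{m\to\infty} \frac{|T|\log m - \log k! - P_e^{(m)}\log\binom{m}{k} - 1 - n_m\epsilon_{1,n_m}}{|T|\, n_m} \le \frac{1}{2|T|}\log\left(1 + \frac{\sigma_a^2}{\sigma_z^2}\sum_{j\in T} w_j^2\right)$$

for all nonempty $T \subseteq [k]$. Due to the fact that $\log\binom{m}{k} \le k\log m$, we have

$$\limsup_{m\to\infty} \left[\frac{\left(1 - k P_e^{(m)}/|T|\right)\log m}{n_m} - \frac{\log k! + n_m\epsilon_{1,n_m} + 1}{|T|\, n_m}\right] \le \frac{1}{2|T|}\log\left(1 + \frac{\sigma_a^2}{\sigma_z^2}\sum_{j\in T} w_j^2\right)$$

for all nonempty $T \subseteq [k]$. Since $\lim_{m\to\infty} P_e^{(m)} = 0$, we reach the conclusion

$$\limsup_{m\to\infty} \frac{\log m}{n_m} \le \frac{1}{2|T|}\log\left(1 + \frac{\sigma_a^2}{\sigma_z^2}\sum_{j\in T} w_j^2\right)$$

for all nonempty $T \subseteq [k]$, which completes the proof of Theorem 2.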
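
Since the final bound must hold for every nonempty $T \subseteq [k]$, the binding constraint is the minimum over subsets. The sketch below enumerates the subsets and evaluates that minimum for a given $\mathbf{w}$. It is only an illustration: the function name `rate_bound` is hypothetical, logarithms are assumed to be base 2, and the brute-force enumeration is practical only for small $k$.

```python
import math
from itertools import combinations
from typing import Sequence


def rate_bound(w: Sequence[float], sigma_a2: float, sigma_z2: float) -> float:
    """min over nonempty T of (1 / (2|T|)) * log2(1 + (sigma_a2 / sigma_z2) * sum_{j in T} w_j^2)."""
    k = len(w)
    best = math.inf
    for size in range(1, k + 1):
        for subset in combinations(range(k), size):
            power = sum(w[j] ** 2 for j in subset)
            value = math.log2(1.0 + sigma_a2 * power / sigma_z2) / (2.0 * size)
            best = min(best, value)
    return best


if __name__ == "__main__":
    print(rate_bound([1.0], 1.0, 1.0))        # single nonzero entry
    print(rate_bound([1.0, 1.0], 1.0, 1.0))   # the full set T = {1, 2} is binding here
```

In the two-entry example the constraint from $T = [k]$ is tighter than either single-entry constraint, mirroring the way the sum-rate constraint can dominate in a multiple access channel.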

Appendix C 
Proof of Theorem 3 

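We show that

$$\lim_{m\to\infty} P\left\{d^{(m)}\left(A^{(m)}\mathbf{X}(\mathbf{w}^{(m)}, \mathbf{S}) + \mathbf{Z}\right) \ne \mathrm{supp}\left(\mathbf{X}(\mathbf{w}^{(m)}, \mathbf{S})\right)\right\} = 0$$

provided that the condition

$$\limsup_{m\to\infty} \frac{1}{n_m}\max_{j\in[k_m]}\left[\frac{6k_m\log k_m + 2j\log m}{\log\left(\frac{j w_{\min}^2\sigma_a^2}{\sigma_z^2} + 1\right)}\right] < 1 \tag{43}$$

is satisfied. Note that (43) implies that $n = \max\left[\Omega(k\log k), \Omega\left(\frac{k}{\log k}\log m\right)\right]$, which in turn implies that
$k = o(n)$.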

We follow the proof of Theorem 1 in Appendix A. Recall that in case 2 of the proof of Theorem 1,
we first proposed the support recovery rule (28). Then, we formed estimates of the nonzero values, and
used them to test all possible sets of $k$ indices. The key step was to analyze two types of errors. On the
one hand, the true support should satisfy the reconstruction rule (28) with high probability. On the other
hand, the probability that at least one incorrect support possibility satisfies this rule was controlled to
diminish as the problem size increases.

By mainly replicating the steps in Appendix A with necessary accommodations to the new setting with
growing number of nonzero entries, we present the proof of Theorem 3 as follows.



1) We first modify the support recovery rule by replacing (28) with

$$\frac{1}{n_m}\left\|\mathbf{Y} - \sum_{j=1}^k \hat{w}_j\mathbf{A}_{\hat{s}_j}\right\|^2 \le (1 + \epsilon)\sigma_z^2 + 2\epsilon^2\sigma_a^2. \tag{44}$$

2) The cardinality $q(r, \zeta)$ of a minimal $Q(r, \zeta)$ can be upper-bounded by

$$q(r, \zeta) \le \left(\frac{\eta_1 k r}{\zeta}\right)^k$$

for some $\eta_1 > 0$. This can be easily shown by first partitioning the $k$-dimensional hypercube of side
$2r$ into identical elementary hypercubes with side length chosen small enough relative to $\zeta$ and then, for each elementary
hypercube that intersects the hypersphere, picking an arbitrary point on the hypersphere within that
elementary hypercube. The resulting set of points provides the upper bound above for $q(r, \zeta)$.
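3) Define $\sigma_{\max}^2$ and $\sigma_{\min}^2$ to be the largest and smallest eigenvalues of the matrix

$$\frac{1}{n\sigma_a^2}\left[\mathbf{A}_1, \ldots, \mathbf{A}_k, \frac{\sigma_a}{\sigma_z}\mathbf{Z}\right]^T\left[\mathbf{A}_1, \ldots, \mathbf{A}_k, \frac{\sigma_a}{\sigma_z}\mathbf{Z}\right]$$

respectively. We replace the definition of $\mathcal{E}_{\mathrm{aux}}$ by

$$\mathcal{E}_{\mathrm{aux}} \triangleq \left\{\hat{W} - \|\mathbf{w}\| \in (-\epsilon, \epsilon)\right\} \cap \left\{\sigma_{\max}^2 \in (1 - \epsilon, 1 + \epsilon)\right\} \cap \left\{\sigma_{\min}^2 \in (1 - \epsilon, 1 + \epsilon)\right\}.$$

Consider the asymptotic behaviors of the events. First, note that

$$\sqrt{\frac{1}{n}\|\mathbf{Y}\|^2} = \sqrt{\frac{\|\mathbf{w}\|^2\sigma_a^2 + \sigma_z^2}{n}} \cdot \frac{\|\mathbf{Y}\|}{\sqrt{\|\mathbf{w}\|^2\sigma_a^2 + \sigma_z^2}} \tag{45}$$

where $\frac{\|\mathbf{Y}\|}{\sqrt{\|\mathbf{w}\|^2\sigma_a^2 + \sigma_z^2}}$ is $\chi$-distributed with mean $\sqrt{2}\,\frac{\Gamma((n+1)/2)}{\Gamma(n/2)}$ and variance $n - \frac{2\Gamma^2((n+1)/2)}{\Gamma^2(n/2)}$.
Then, $\sqrt{\frac{1}{n}\|\mathbf{Y}\|^2}$ has mean $\sqrt{\frac{\|\mathbf{w}\|^2\sigma_a^2 + \sigma_z^2}{n}}\,\sqrt{2}\,\frac{\Gamma((n+1)/2)}{\Gamma(n/2)}$ and variance

$$\frac{\|\mathbf{w}\|^2\sigma_a^2 + \sigma_z^2}{n}\left(n - \frac{2\Gamma^2((n+1)/2)}{\Gamma^2(n/2)}\right).$$

It has been shown [58] that

$$\lim_{x\to\infty} \frac{x\Gamma(x)}{\sqrt{x + 1/4}\;\Gamma(x + 1/2)} = 1.$$

Then, as $n \to \infty$, $\sqrt{\frac{1}{n}\|\mathbf{Y}\|^2}$ has asymptotic mean $\sqrt{\|\mathbf{w}\|^2\sigma_a^2 + \sigma_z^2}$ and variance $\frac{\|\mathbf{w}\|^2\sigma_a^2 + \sigma_z^2}{2n}$. Since
$k = o(n)$, we have $\frac{\|\mathbf{w}\|^2\sigma_a^2 + \sigma_z^2}{2n} \to 0$. Hence, $\lim_{m\to\infty} P\{\hat{W} - \|\mathbf{w}\| \in (-\epsilon, \epsilon)\} = 1$.

Second, $\sigma_{\max}^2$ and $\sigma_{\min}^2$ are shown [59] to almost surely converge to $(1 + \varrho)^2$ and $(1 - \varrho)^2$, respectively,
where $\varrho = \lim_{n\to\infty} \sqrt{(k + 1)/n} = 0$. Thus, $\lim_{m\to\infty} P(\mathcal{E}_{\mathrm{aux}}^c) = 0$.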
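4) Next, we analyze the probability that the true support satisfies the recovery rule. Note that

$$\frac{1}{n}\left\|\mathbf{Y} - \sum_{j=1}^k \hat{w}_j\mathbf{A}_j\right\|^2 = \frac{1}{n}\left\|\sum_{j=1}^k (w_j - \hat{w}_j)\mathbf{A}_j + \mathbf{Z}\right\|^2 = \frac{1}{n}\left\|\left[\mathbf{A}_1, \ldots, \mathbf{A}_k, \frac{\sigma_a}{\sigma_z}\mathbf{Z}\right]\begin{bmatrix}\mathbf{w} - \hat{\mathbf{w}} \\ \sigma_z/\sigma_a\end{bmatrix}\right\|^2$$

$$\le \sigma_{\max}^2\sigma_a^2\left\|\begin{bmatrix}\mathbf{w} - \hat{\mathbf{w}} \\ \sigma_z/\sigma_a\end{bmatrix}\right\|^2 = \sigma_{\max}^2\sigma_a^2\|\mathbf{w} - \hat{\mathbf{w}}\|^2 + \sigma_{\max}^2\sigma_z^2. \tag{46}$$

By using the fact that $\sigma_{\max}^2 \to 1$ almost surely as $n \to \infty$ and Lemma 2-1), we have $\lim_{m\to\infty} P(\mathcal{E}_{1,2,\ldots,k}^c) = 0$.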

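5) Now, suppose we have proceeded to a step similar to (32) (that is, to be exact, equipped with the
modified rule (44) and a proper $\mathcal{E}_{\mathrm{cond}}$). Define the auxiliary vector $\mathbf{w}' \in \mathbb{R}^{k+1}$ as

$$w'_j \triangleq \begin{cases} w_j & \text{if } j \in [k] \setminus T, \\ w_j - \hat{w}_i & \text{if } j = s'_i \in T, \\ \sigma_z/\sigma_a & \text{if } j = k + 1. \end{cases} \tag{47}$$

Then,

$$\frac{1}{n}\left\|\sum_{j=1}^k w_j\mathbf{A}_j - \sum_{s'_j \in T} \hat{w}_j\mathbf{A}_{s'_j} + \mathbf{Z}\right\|^2 = \frac{1}{n}\left\|\left[\mathbf{A}_1, \ldots, \mathbf{A}_k, \frac{\sigma_a}{\sigma_z}\mathbf{Z}\right]\mathbf{w}'\right\|^2 \ge (1 - \epsilon)\|\mathbf{w}'\|^2\sigma_a^2 \ge (1 - \epsilon)\left(\sum_{j\in[k]\setminus T} w_j^2\sigma_a^2 + \sigma_z^2\right).$$

From Lemma 1, it follows that (for sufficiently small $\epsilon$)

$$P\left(\frac{1}{n}\left\|\mathbf{Y} - \sum_{j=1}^k \hat{w}_j\mathbf{A}_{s'_j}\right\|^2 \le (1 + \epsilon)\sigma_z^2 + 2\epsilon^2\sigma_a^2 \,\Big|\, \mathcal{E}_{\mathrm{cond}}\right) \le 2^{-\frac{n}{2}\log\frac{(1 - \epsilon)\left(\sum_{j\in[k]\setminus T} w_j^2\sigma_a^2 + \sigma_z^2\right)}{(1 + \epsilon)\sigma_z^2 + 2\epsilon^2\sigma_a^2}}.$$
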

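6) With these modifications, we follow the proof steps of Theorem 1 to reach

$$P(\mathcal{E}) \le P(\mathcal{E}_{\mathrm{aux}}^c) + P(\mathcal{E}_{1,2,\ldots,k}^c) + k! \cdot q(\|\mathbf{w}\| + \epsilon, \epsilon) \cdot \sum_{T \subseteq [k],\, T \ne \emptyset} 2^{|T|\log m - \frac{n}{2}\log\frac{(1 - \epsilon)(|T| w_{\min}^2\sigma_a^2 + \sigma_z^2)}{(1 + \epsilon)\sigma_z^2 + 2\epsilon^2\sigma_a^2}}$$

$$\le P(\mathcal{E}_{\mathrm{aux}}^c) + P(\mathcal{E}_{1,2,\ldots,k}^c) + k! \cdot q(\|\mathbf{w}\| + \epsilon, \epsilon) \cdot 2^k \cdot \max_{j\in[k]} 2^{j\log m - \frac{n}{2}\log\frac{(1 - \epsilon)(j w_{\min}^2\sigma_a^2 + \sigma_z^2)}{(1 + \epsilon)\sigma_z^2 + 2\epsilon^2\sigma_a^2}} \tag{48}$$

where $T$ now indexes the entries of the true support that are missed by the candidate support. Note that

$$\log\left(k! \cdot q(\|\mathbf{w}\| + \epsilon, \epsilon) \cdot 2^k \cdot \max_{j\in[k]} 2^{j\log m - \frac{n}{2}\log\frac{(1 - \epsilon)(j w_{\min}^2\sigma_a^2 + \sigma_z^2)}{(1 + \epsilon)\sigma_z^2 + 2\epsilon^2\sigma_a^2}}\right)$$

$$\le k\log k + k\log\left(\eta_1 k^{3/2} w_{\max}/\epsilon\right) + k + \max_{j\in[k]}\left[j\log m - \frac{n}{2}\log\frac{(1 - \epsilon)(j w_{\min}^2\sigma_a^2 + \sigma_z^2)}{(1 + \epsilon)\sigma_z^2 + 2\epsilon^2\sigma_a^2}\right]. \tag{49}$$

It can be readily seen that from the condition (43), the upper bound in (49) becomes negative and
thus $P(\mathcal{E}) \to 0$ as $m \to \infty$.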
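
To get a feel for the sufficient condition at finite problem sizes, the left-hand side of (43) can be evaluated directly. The sketch below assumes the condition has the form $\frac{1}{n}\max_{j\in[k]} \frac{6k\log k + 2j\log m}{\log(j w_{\min}^2\sigma_a^2/\sigma_z^2 + 1)}$ with base-2 logarithms; the function name and argument names are hypothetical, and a finite-size value below 1 is only suggestive, since the theorem is asymptotic.

```python
import math


def sufficient_condition_ratio(
    n: int, m: int, k: int, w_min: float, sigma_a2: float, sigma_z2: float
) -> float:
    """Finite-size evaluation of (1/n) * max_j (6 k log k + 2 j log m) / log(j * snr + 1)."""
    snr = w_min ** 2 * sigma_a2 / sigma_z2
    return max(
        (6.0 * k * math.log2(k) + 2.0 * j * math.log2(m))
        / (n * math.log2(j * snr + 1.0))
        for j in range(1, k + 1)
    )


if __name__ == "__main__":
    # Hypothetical toy sizes: k = 2 nonzero entries, m = 16 columns, n = 100 measurements.
    print(sufficient_condition_ratio(100, 16, 2, 1.0, 1.0, 1.0))
```

The ratio scales as $1/n$, so doubling the number of measurements halves it, which makes it easy to read off roughly how many measurements bring the ratio below 1 for given $k$ and $m$.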

Appendix D
Proof of Theorem 4 

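To establish this theorem, we prove the following equivalent statement:

If there exists a sequence of matrices $\{A^{(m)}\}_{m=k}^{\infty}$, $A^{(m)} \in \mathbb{R}^{n_m \times m}$, and a sequence of support recovery
maps $\{d^{(m)}\}_{m=k}^{\infty}$, $d^{(m)}: \mathbb{R}^{n_m} \to 2^{\{1,2,\ldots,m\}}$, such that

$$\frac{1}{n_m m}\left\|A^{(m)}\right\|_F^2 \le \sigma_a^2$$

and

$$\lim_{m\to\infty} P_e(\mathbf{w}^{(m)}, A^{(m)}) = 0$$

then

$$\limsup_{m\to\infty} \frac{2k_m\log(m/k_m)}{n_m\log\left(\frac{2k_m w_{\max}^2\sigma_a^2}{\sigma_z^2} + 1\right)} \le 1.$$

To justify this alternative claim, we follow the steps for the proof of Theorem 2 in Appendix B.
Necessary modifications and clarifications are presented as follows.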
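1) Note that

$$\epsilon_{1,n} = \frac{1}{n}\log\left(m^{|T|} \Big/ \prod_{q=0}^{|T|-1} (m - (k - |T|) - q)\right) \le \frac{|T|}{n}\log\frac{m}{m - k + 1}. \tag{50}$$

2) For any $T \subseteq [k]$, we follow (42) in Appendix B to reach

$$|T|\log m - \log k! - P_e^{(m)}\log\binom{m}{k} - 1 - n\epsilon_{1,n} \le \frac{n}{2}\log\left(\Big(\sum_{j\in T} w_j^2 - \tau(m)\Big)\frac{\sigma_a^2}{\sigma_z^2} + 1\right). \tag{51}$$

Note that

$$\sum_{j\in T} w_j^2 - \tau(m) \le \sum_{j\in T} w_j^2 + \sum_{j\in T}\sum_{l\in T,\, l\ne j} \frac{|w_j w_l|}{m - 1} \le |T| w_{\max}^2 + \frac{|T|(|T| - 1) w_{\max}^2}{m - 1} \le 2|T| w_{\max}^2. \tag{52}$$

Then, it follows from (50), (51), and (52) that the inequality

$$|T|\log m - k\log k - P_e^{(m)} k\log m - 1 - n \cdot \frac{|T|}{n}\log\frac{m}{m - k + 1} \le \frac{n}{2}\log\left(\frac{2|T| w_{\max}^2\sigma_a^2}{\sigma_z^2} + 1\right)$$

must hold for any $T \subseteq [k]$. By choosing $T = [k]$, we have

$$\limsup_{m\to\infty} \frac{2k_m\left(\log(m - k_m + 1) - \log k_m - P_e^{(m)}\log m - \frac{1}{k_m}\right)}{n_m\log\left(\frac{2k_m w_{\max}^2\sigma_a^2}{\sigma_z^2} + 1\right)} \le 1$$

which implies

$$\limsup_{m\to\infty} \frac{2k_m(\log m - \log k_m)}{n_m\log\left(\frac{2k_m w_{\max}^2\sigma_a^2}{\sigma_z^2} + 1\right)} \le 1.$$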
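
Two quantities from this proof are easy to evaluate numerically: the correction term $\epsilon_{1,n}$ together with its bound in (50), and the measurement count suggested by reading the final condition at finite size, $n \gtrsim 2k\log(m/k) / \log(2k w_{\max}^2\sigma_a^2/\sigma_z^2 + 1)$. The sketch below assumes base-2 logarithms; the function names are hypothetical, and the finite-size reading of an asymptotic statement is heuristic.

```python
import math


def epsilon_term(n: int, m: int, k: int, t: int) -> float:
    """epsilon_{1,n} = (1/n) * log2( m^t / prod_{q=0}^{t-1} (m - (k - t) - q) ), with t = |T|."""
    log_product = sum(math.log2(m - (k - t) - q) for q in range(t))
    return (t * math.log2(m) - log_product) / n


def epsilon_bound(n: int, m: int, k: int, t: int) -> float:
    """Upper bound (t / n) * log2( m / (m - k + 1) ) on epsilon_{1,n}."""
    return (t / n) * math.log2(m / (m - k + 1))


def necessary_measurements(
    m: int, k: int, w_max: float, sigma_a2: float, sigma_z2: float
) -> float:
    """Heuristic finite-size reading of the necessary condition on n."""
    snr = w_max ** 2 * sigma_a2 / sigma_z2
    return 2.0 * k * math.log2(m / k) / math.log2(2.0 * k * snr + 1.0)


if __name__ == "__main__":
    print(epsilon_term(5, 10, 3, 2), epsilon_bound(5, 10, 3, 2))
    print(necessary_measurements(64, 4, 1.0, 1.0, 1.0))
```

The bound holds because each factor in the product is at least $m - k + 1$, so the ratio inside the logarithm is at most $(m/(m - k + 1))^{|T|}$.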